UNCLASSIFIED 


DEFENSE  DOCUMENTATION  CENTER 

FOR 

SCIENTIFIC  AND  TECHNICAL  INFORMATION 

CAMERON STATION, ALEXANDRIA, VIRGINIA
CLASSIFICATION  CHANGED 

TO  UNCLASSIFIED 
FROM  OTCUSSIFllt)  .  UMTIlpj 


UNCLASSIFIED 


NOTICE: When government or other drawings, specifications or
other data are used for any purpose other than in connection
with a definitely related government procurement operation, the
U. S. Government thereby incurs no responsibility, nor any
obligation whatsoever; and the fact that the Government may have
formulated, furnished, or in any way supplied the said drawings,
specifications, or other data is not to be regarded by
implication or otherwise as in any manner licensing the holder
or any other person or corporation, or conveying any rights or
permission to manufacture, use or sell any patented invention
that may in any way be related thereto.


US Government agencies may obtain copies from ASTIA. Other qualified ASTIA users may request through Rome Air Development Center, Data Utilization
Branch, RAWIP, Griffiss AFB, N.Y.

This document may be reproduced to satisfy official needs of US Government agencies. No other reproduction authorized except with permission of Rome Air
Development Center, Data Utilization Branch, RAWIP, Griffiss AFB, N.Y.

When US Government drawings, specifications, or other data are used for any purpose other than a definitely related government procurement operation, the
government thereby incurs no responsibility nor any obligation whatsoever; and the fact that the government may have formulated, furnished, or in any way
supplied the said drawings, specifications, or other data is not to be regarded by implication or otherwise, as in any manner licensing the holder or any other
person or corporation, or conveying any rights or permission to manufacture, use, or sell any patented invention that may in any way be related thereto.


RADC-TDR-63-93 


2  February  1963 


AUTOMATIC  ABSTRACTING 
C107-3U1 


TRW  COMPUTER  DIVISION 
THOMPSON  RAMO  WOOLDRIDGE  INC. 
CANOGA PARK, CALIFORNIA


Contract  No.  AF  30(602)-2223 
Engineering  Change  B 


Prepared  under  the  Sponsorship  of  the 
Intelligence  Processing  Laboratory 
Rome  Air  Development  Center 
Air  Force  Systems  Command 
United  States  Air  Force 
Griffiss  Air  Force  Base 
New  York 


ACKNOWLEDGMENT 


The TRW team expresses appreciation to several
members of the staff of Rome Air Development Center for
invaluable assistance in the performance of this work.

Mr. John McNamara of the Data Handling Section was
Project Engineer on this study. Mr. Al Leavitt of the
Human Engineering Laboratory assisted in the area of
experimental design. Both of these men made particularly
useful contributions to the project.

Key members of the TRW team were J. L. Kuhns,
who conducted mathematical research, Dr. P. L. Garvin,
who assisted on dictionary compilation, L. Ertel, who
performed the programming, and J. Brewer, who performed
content analysis.

H. P. Edmundson, Project Manager




ABSTRACT 


This Final Report on Automatic Abstracting presents a
series of additions and refinements to the previous RADC
contract study. During this contract, two major results were
obtained: an operating system and a research methodology.
The operating system produces automatic abstracts
via programs written for the IBM 7090 and the IBM 1401.

The programming involved the preparation of an Edit
Program which inputs the text of documents, a Cue Dictionary
Program which inputs a fixed word list, and an Abstracting
Program which selects and outputs sentences of the document.
The research methodology proceeds from linguistic analysis
of documents comprising a sample library, the compilation
of dictionaries, the formulation of abstracting rules which are
applied to new documents of an experimental library, and
concludes with testing and evaluation of the final program and
dictionaries on documents of a test library.

Examples are given of abstracts produced by the four
basic methods: Cue method, Key method, Title method, and
Location method. In addition, a combined method using Cue-
Title-Location is exemplified as the preferred method.
Presented in a separate volume are 211 examples of automatic
abstracts produced by the combined method. Conclusions
resulting from this study and recommendations for future
research are presented.


TABLE  OF  CONTENTS 

ACKNOWLEDGMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF EXHIBITS

1.  INTRODUCTION AND SUMMARY
1.1  Structure of Final Report
1.2  The Problem
1.2.1  Goals
1.2.2  Definition of Abstract
1.2.3  Specific Tasks
1.3  Project Results
1.3.1  Operating System
1.3.2  Research Methodology
1.4  Conclusions and Recommendations
1.4.1  Conclusions
1.4.2  Recommendations
1.4.3  Outlook

2.  OPERATING SYSTEM
2.1  Input
2.1.1  Pre-editing
2.1.2  Keypunching
2.1.3  Edit Program
2.1.4  Cue Dictionary Program
2.2  Processing
2.2.1  Cue Method
2.2.2  Key Method
2.2.3  Title Method
2.2.4  Location Method
2.2.5  Combined Methods
2.3  Output
2.3.1  System Output
2.3.2  Research Output

3.  RESEARCH METHODOLOGY
3.1  The Research Problem
3.2  Corpus
3.2.1  Heterogeneous Corpus
3.2.2  Exotic Fuel Corpus
3.2.3  Libraries
3.3  Theory
3.3.1  Definition and Creation of Abstracts
3.3.2  Guiding Principles
3.3.3  The Four Basic Methods
3.3.4  Characteristics, Dictionaries, and Weights
3.4  Experiments
3.4.1  Preliminary Test
3.4.2  Experimental Cycles
3.5  Evaluation
3.5.1  Rating Procedure
3.5.2  Comments on Evaluation Procedures

REFERENCES

APPENDIX:  Operating System (bound separately)
Section 1  Pre-edit Instructions
Section 2  Keypunch Instructions
Section 3  Input and Output Formats
Section 4  Program Descriptions and Flow Charts
Section 5  Operating Instructions


LIST  OF  EXHIBITS 


Exhibit 1.   Time and Cost Estimates
Exhibit 2.   Operating System
Exhibit 3.   Reproduction of Pre-edited Version of Document 6
Exhibit 4.   Section of Cue Dictionary
Exhibit 5.   Heading Dictionary
Exhibit 6.   Automatic Abstract of Document 6 Produced by Cue Method C
Exhibit 7.   Automatic Abstract of Document 6 Produced by Key Method K
Exhibit 8.   Automatic Abstract of Document 6 Produced by Title Method T
Exhibit 9.   Automatic Abstract of Document 6 Produced by Location
             Method L
Exhibit 10.  Automatic Abstract of Document 6 Produced by Combined
             Method C-K-T-L
Exhibit 11.  Automatic Abstract of Document 6 Produced by Combined
             Method C-T-L
Exhibit 12.  Key Words of Document 6 Ordered by Decreasing Weights
             Assigned by Key Method K
Exhibit 13.  Key Words of Document 6 in Alphabetic Order as Produced
             by Key Method K
Exhibit 14.  Sentence Numbers of Document 6 in Textual Order with
             Weights Assigned by Combined Method C-T-L
Exhibit 15.  Sentence Numbers of Document 6 Ordered by Decreasing
             Weights Assigned by Combined Method C-T-L
Exhibit 16.  Portion of Vertical Listing of Document 6 Produced by
             Combined Method C-K-T-L
Exhibit 17.  Research Methodology for Automatic Abstracting Study
Exhibit 18.  Instructions for Abstractors
Exhibit 19.  Analysis of Document 6 and Comparison of Human and
             Machine Abstracts
Exhibit 20.  Target Abstract
Exhibit 21.  Random Extract
Exhibit 22.  Inventory of Sentence Characteristics
Exhibit 23.  Chart of Experiments
Exhibit 24.  Analysis of C-T-L Selected Sentences
Exhibit 25.  Instructions for Raters
Exhibit 26.  Rating Form



1.  INTRODUCTION  AND  SUMMARY 


1.1  STRUCTURE OF FINAL REPORT

The Final Report covering this contract consists of
descriptions of the research performed, analyses of the results
obtained, and copies of computer flow charts, together with operating
instructions so that the automatic Abstracting Program can be
use-tested. Program decks and dictionary decks have been
submitted separately.

This final report on Automatic Abstracting systems consists
of three levels of detail.

The  first  level  presents  in  the  broadest  terms  the  statement 
of  the  problem  and  results  obtained  and  is  intended  for  the  reader 
who  is  interested  only  in  gross  landmarks.  Section  1  of  this  report 
treats  this  first  level  and  is  printed  on  colored  paper. 

The second level of detail covers the operating system and
the  research  methodology  in  greater  detail  and  is  intended  for  the 
reader  who  has  had  some  experience  with  the  problem  of  automatic 
abstracting  or  allied  problems  such  as  automatic  translation. 
Sections  2  and  3  of  this  final  report  cover  this  material. 

The  third  and  final  level  covers  the  same  ground  as  levels  1 
and 2, but in minute detail, and is intended to be read by programmers
and computer specialists who need exact descriptions
of  formats  and  instructions  in  order  to  operate  the  abstracting 
system  on  a  computer.  The  Appendix  of  this  report  consists  of 
five  sections  which  give  these  fine  details. 

In  addition  to  the  preceding  three  levels  the  final  report  is 
supported  by  211  machine  abstracts  of  documents  taken  from  a 
corpus  on  exotic  fuels. 




1.2  THE PROBLEM


1.2.1  Goals

The  purpose  of  the  contract  was  to  conduct  an  investigation 
and  study  to  develop  techniques  for  the  automatic  abstracting  of 
textual  information. 

This  present  study  has  the  following  orientation: 

(1)  Continuation  of  the  previous  research  by  developing 
new  and  modifying  previous  computer  routines  for 
automatic  abstracting. 

(2)  Experimentation  and  investigation  of  the  effectiveness 
of  the  abstracting  routines  on  new  bodies  of  text. 

(3)  Development  of  external  objective  criteria  to  evaluate 
the  different  abstracting  routines  relative  to  each  other. 

For the purposes of this contract, abstract content was
defined in terms of subject matter, purpose, methods, conclusions,
generalizations, and recommendations; the assumption
was made that as a minimum, such abstract content would, to
a large degree, satisfy the screening function of abstracts. The
research plan was to continue the previous research by developing
new and modifying previous computer routines for automatic
abstracting. These procedures were tested in a series of experimental
cycles, each consisting of computer run, analysis of
output, and program correction.

The  tests  were  conducted  on  a  library  of  homogeneous 
subject  matter  and  the  various  abstracting  techniques  were 
evaluated  relative  to  each  other  by  external  objective  criteria 
developed  in  the  course  of  the  contract.  The  end  products  are 
detailed  working  programs  and  ratings  of  the  potential  worth  of 
the  final  methods  and  products. 

1.2.2  Definition of an Abstract

In defining an abstract of a document, we must specify the
following three aspects: content, form, and length. The problem
of content in an automatic abstract is that of selecting or rejecting
sentences of the original document so as to form an extract or
abstract. The problem of form is that of deciding how these
sentences so selected are finally presented to the reader in
relation to the formatting of the title, authors, headings and
subheadings, graphics, footnotes, and references. The problem
of length is that of deciding how many words or sentences will
constitute the final output according to fixed rules, variable rules,
and thresholds of compactness.

It is currently believed that the notion of the abstract of a
document is simple and generally understood; i.e., to every
document there corresponds one abstract. Or to put it mathematically,
the abstract A is a function of the document D, i.e.,
A = f(D). Moreover, since A is really an extract, A is a subset
of D, i.e., A ⊆ D.

However, on closer examination we see that a document can
have many abstracts which differ from one another in their intended
use. Thus the act of abstracting is definitely goal-oriented.
With the realization that it is misleading to conceive
of the abstract, it is proper now to speak of an abstract of a
document. Thus an abstract is a function of two quantities, the
document D and the use U, i.e., A = f(D, U).
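The relation A = f(D, U), together with the constraint that A is a
subset of D, can be illustrated by a minimal sketch. The function name
extract, the predicate standing in for the use U, and the sample
sentences are assumptions made for illustration only; they are not
part of the programs described in this report.

```python
from typing import Callable, List


def extract(document: List[str], wanted: Callable[[str], bool]) -> List[str]:
    """Return the sentences of `document` accepted by a use-specific test.

    The result keeps textual order and contains only sentences of the
    document, so it is an extract: A is a subset of D.
    """
    return [sentence for sentence in document if wanted(sentence)]


# Hypothetical document D and two hypothetical uses U1 and U2.
D = [
    "The purpose of this study is to measure burning rates.",
    "Samples were prepared by the usual method.",
    "We conclude that the additive lowers the burning rate.",
]


def screening(sentence: str) -> bool:
    """U1: keep statements of purpose and conclusions."""
    return "purpose" in sentence or "conclude" in sentence


def methods_only(sentence: str) -> bool:
    """U2: keep statements about method."""
    return "method" in sentence


A1 = extract(D, screening)
A2 = extract(D, methods_only)
```

The same document yields two different extracts, one per use, which is
the point of writing the abstract as a function of both D and U.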

Despite  the  fact  that  the  preceding  observation  is  simple 
and  intuitively  acceptable,  its  consequences  are  neither  of  these. 
In  fact,  it  provides  the  foundation  for  the  proper  solution  to  the 
problem  of  automatic  abstracting.  Because  of  the  number  of 
alternative  uses,  it  is  necessary  to  define  abstract  content 
explicitly  in  terms  that  are  use-oriented.  This  definition  must 
be  expressed  by  machine  criteria.  This  requires  detailed 
specification  far  beyond  what  intuitively  might  have  been  expected. 
Thus, we eliminate arguments over what is an abstract by replacing
useless generalities with specific operational criteria.

This  problem  is  closely  related  to  the  section  of  this  report 
devoted  to  evaluation.  It  involves  questions  of  the  existence  of  a 
completely  general  definition  of  an  abstract  versus  that  of  many 
specific  definitions. 




This  leads  us  to  the  concept  of  a  tailor-made  abstract,  in 
the  sense  that  an  individual  will  be  able  in  future  automatic 
systems  to  specify  more  accurately  what  he  wants  in  an  abstract. 
Moreover,  this  feature  distinguishes  automatic  abstracting  from 
automatic  translation.  It  is  generally  believed  that,  aside  from 
minor  stylistic  variations,  there  is  only  one  translation  of  a 
document.  On  the  other  hand,  we  have  seen  that  a  document  can 
have  many  different  abstracts.  This  difference  is  fundamental 
to  the  problem  of  evaluating  the  quality  of  automatic  abstracts. 


1.2.3  Specific Tasks

The general problem of defining an abstract must next be
translated into a set of specific tasks that will lead to an operating
system. In light of the previous research that produced
automatic abstracts from documents in a heterogeneous corpus,
the shift to a new computer system and to a new corpus necessitated
carrying out the following specific tasks:

(1)  Convert  previously  developed  computer  routines  to 
facilitate  the  abstracting  experiments  proposed  for 
the  new  effort. 

(2)  Improve  and  modify  existing  abstracting  routines; 
develop  new  linguistic  factors  for  incorporation  in 
abstracting  routines;  design  abstracting  experiments. 

(3)  Develop  evaluation  procedure. 

(4)  Select corpus for the new Experimental and Test
Libraries; reproduce, pre-edit, and keypunch this
corpus.

(5)  Incorporate  improved  abstracting  routines  in 
computer  program. 

(6)  Manually  prepare  extracts  of  documents  from  the 
Experimental  Library;  program  mechanizable  features 
of  the  evaluation  procedures. 

(7)  Perform  abstracting  experiments  on  Experimental 
Library;  evaluate  and  modify  program. 




(8)  Perform abstracting experiments on the Test Library
and evaluate abstracts produced.

(9)  Prepare program flow charts and operating instructions.


1.3  PROJECT RESULTS

In accordance with the goals of the automatic abstracting
project, this final report presents two aspects, an operating
system and research methodology. The former is intended to
provide a computation center with sufficient information to
initiate an operating system for purposes of automatic abstracting,
while the latter presents the research methodology initiated
and developed during the contract, whose final result is the
operating system.

1.3.1  Operating System

The operating system is presented first. The cardinal
feature of the operating system is its flexibility. By this, we
mean that the system has been parameterized whenever possible
in order to allow easy modification by other users and for other
purposes. This has been accomplished by making it possible to
modify both externally and internally stored word lists, to readjust
weights assigned to words, to permit the use of 15
combinations of the four basic methods of abstracting, and to
alter the truncation thresholds which determine the length of an
automatic abstract.
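The figure of 15 combinations is the number of non-empty subsets of
the four basic methods (2^4 - 1 = 15). The following sketch enumerates
them; the letter codes C, K, T, and L follow the exhibit titles, while
the function name is an assumption made for illustration.

```python
from itertools import combinations

METHODS = ("C", "K", "T", "L")  # Cue, Key, Title, Location


def method_combinations(methods=METHODS):
    """List every non-empty subset of the basic methods, smallest first."""
    return [
        "-".join(subset)
        for size in range(1, len(methods) + 1)
        for subset in combinations(methods, size)
    ]


combos = method_combinations()
for name in combos:
    print(name)
```

The list includes the four single methods as well as the combined
methods C-T-L and C-K-T-L named in the exhibits.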

The  operating  system  comprises: 

(1)  Two  Corpora 

Heterogeneous Corpus:  200 documents comprising
several different subject fields (Physical Science,
Life Science, Humanities, and Information Science).

Exotic Fuel Corpus:  200 ASTIA documents
concerning the chemistry of exotic fuels.

Both corpora exist on punched cards and magnetic
tape.




(2)  Four  Computer  Programs  for  Automatic  Abstracting 

Cue  Routine:  weights  sentences  according  to  match 
of  text  words  with  Cue  Dictionary. 

Key  Routine:  weights  sentences  according  to 
frequency  of  word  occurrence. 

Location Routine: weights sentences according to
their  location  in  document  and  matches  text  words 
with  Heading  Dictionary. 

Title  Routine:  weights  sentences  according  to  match 
of  text  words  with  title  and  subheading  words. 

These  programs  were  designed  with  experimentation 
in  mind. 

(3)  Computer  Routines  to  Facilitate  Experimentation 

Concordance  Program:  a  program  to  generate  a 
concordance  of  a  document  (with  context  displayed 
in  the  output)  and  classify  words  and  sentences 
according  to  the  number  of  times  they  were  used  in 
manual  extracts  (see  Reference  7). 

Cue Dictionary Program: a program to feed the
Cue  Dictionary  into  the  Abstracting  Program. 
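The four routines listed under item (2) above can be sketched as four
sentence-weighting functions whose results are added for a combined
method. This is a simplified illustration under stated assumptions,
not the 7090 program: the additive combination, the frequency
threshold, the first-and-last-sentence location rule (the Heading
Dictionary is omitted), and all names and sample data are
hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def words(sentence: str) -> List[str]:
    """Split a sentence into lower-case words with edge punctuation removed."""
    stripped = (token.strip(".,;:()").lower() for token in sentence.split())
    return [word for word in stripped if word]


def cue_weight(sentence: str, cue_dictionary: Dict[str, int]) -> int:
    """C: sum the Cue Dictionary weights of the words in the sentence."""
    return sum(cue_dictionary.get(word, 0) for word in words(sentence))


def key_weight(sentence: str, frequencies: Counter, minimum: int = 2) -> int:
    """K: sum the document frequencies of words seen at least `minimum` times."""
    return sum(frequencies[w] for w in words(sentence) if frequencies[w] >= minimum)


def title_weight(sentence: str, title_words: Set[str]) -> int:
    """T: count the words that also occur in the title."""
    return sum(1 for word in words(sentence) if word in title_words)


def location_weight(index: int, total: int) -> int:
    """L: favour the first and last sentences (assumed rule)."""
    return 1 if index in (0, total - 1) else 0


def sentence_weights(sentences, title, cue_dictionary, methods="CTL") -> List[int]:
    """Add the weights of the selected methods for every sentence."""
    frequencies = Counter(w for s in sentences for w in words(s))
    title_words = set(words(title))
    weights = []
    for index, sentence in enumerate(sentences):
        weight = 0
        if "C" in methods:
            weight += cue_weight(sentence, cue_dictionary)
        if "K" in methods:
            weight += key_weight(sentence, frequencies)
        if "T" in methods:
            weight += title_weight(sentence, title_words)
        if "L" in methods:
            weight += location_weight(index, len(sentences))
        weights.append(weight)
    return weights


def select(sentences: List[str], weights: List[int], limit: int) -> List[str]:
    """Keep the `limit` heaviest sentences and return them in textual order."""
    ranked = sorted(range(len(sentences)), key=lambda i: (-weights[i], i))
    return [sentences[i] for i in sorted(ranked[:limit])]


# Hypothetical sample data.
title = "Burning Rates of Boron Fuels"
sentences = [
    "This report describes burning rates of boron fuels.",
    "The weather was perhaps unusual.",
    "Results show that the additive lowers burning rates significantly.",
]
cue_dictionary = {"results": 3, "significantly": 2, "perhaps": -2}

weights = sentence_weights(sentences, title, cue_dictionary, methods="CTL")
abstract = select(sentences, weights, limit=2)
```

The `limit` argument plays the part of a truncation threshold, and the
`methods` string selects which of the four routines contribute.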

1.3.2  Research Methodology

This  Final  Report  stresses  the  research  performed  during 
the  second  phase  of  the  automatic  abstracting  contract.  It  will  be 
recalled  that  the  corpus  used  in  the  first  phase  of  the  contract  was 
heterogeneous  in  that  its  documents  came  from  the  fields  of 
political  science,  sociology,  political  affairs,  astronomy,  physics, 
etc.  On  the  other  hand,  the  corpus  examined  during  the  second 
phase  of  the  contract  was  strictly  homogeneous  in  that  all  the 
documents  therein  concerned  the  chemistry  of  exotic  fuels. 

This switch to the Exotic Fuel Corpus necessitated some
changes in the words in the basic Cue Dictionary. Moreover, since
these exotic fuel documents were contract reports obtained from
ASTIA they were on the average more uniform with regard to size,
format, and organization than documents of the earlier heterogeneous
corpus. The factors of uniform length, shorter length,
standard form, and technical writing style might cause the
curious reader to wonder, first, if the system originally based on a
heterogeneous corpus would also serve for this homogeneous
one and, secondly, whether the set of changes based on the
Exotic Fuel Corpus would make it impossible for the final
operating system to handle properly the heterogeneous corpus.

The  answer  to  these  questions  cannot  be  completely  given  at 
this  time;  however,  experimentation  will  settle  the  matter.  It 
should  also  be  mentioned  that  even  while  relying  on  the  factors 
noted  above,  constant  attention  was  paid  toward  producing  an 
operating  system  which  would  work  equally  well  for  both  the 
Exotic  Fuel  Corpus  and  any  heterogeneous  corpus.  Determination 
of  the  extent  to  which  this  is  true  awaits  further  experimentation. 

The research methodology comprises:

(1)  A  study  of  the  extracting  behavior  of  humans. 

(2)  A  general  formulation  of  the  abstracting  problem  and 
its  relation  to  the  problem  of  evaluation.  Four  methods 
of  evaluation  that  have  been  studied  are  statistical 
correlations,  information  content  tests,  retrievability 
tests,  and  judges'  ratings  (see  Reference  7). 

(3)  A  mathematical  and  logical  study  of  the  problem  of 
assigning  ranking  numbers  to  sentences. 

(4)  A  set  of  abstracting  experiments  based  on  cyclic 
improvement. 


1.4  CONCLUSIONS AND RECOMMENDATIONS

1.4.1  Conclusions

The conclusions of this final report will be listed under two
headings: operating system and research methodology.

Operating  System 

(1)  An  operating  system  has  been  developed  which 
produces  automatic  abstracts  on  the  IBM  7090 
computer  by  four  distinct  methods. 




(2)  The  system  abstracts  technical  documents  whose 
lengths  do  not  exceed  approximately  3000  words. 

(3)  The computer abstracts at the rate of at least 7800
words per minute on a corpus of 29,500 words (see
Exhibit 1).

(4)  The total system cost (edit, abstracting, system output)
is approximately 1.5 cents per word, of which keypunching
costs 1.0 cents per word (see Exhibit 1).

(5)  The  operating  system  produces  abstracts  of  sufficiently 
high  quality  to  satisfy  the  screening  function. 
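The figures in conclusions (3) and (4) can be turned into a short
worked calculation. The sketch below uses only the rates quoted above;
the variable names are assumptions, and the results are approximate
because the quoted rates are themselves rounded.

```python
CORPUS_WORDS = 29_500          # corpus size quoted in conclusion (3)
WORDS_PER_MINUTE = 7_800       # minimum abstracting rate, conclusion (3)
TOTAL_CENTS_PER_WORD = 1.5     # total system cost, conclusion (4)
KEYPUNCH_CENTS_PER_WORD = 1.0  # keypunching share, conclusion (4)

# Machine time needed to abstract the corpus at the minimum rate.
minutes = CORPUS_WORDS / WORDS_PER_MINUTE

# Cost of processing the whole corpus, in dollars.
total_dollars = CORPUS_WORDS * TOTAL_CENTS_PER_WORD / 100
keypunch_dollars = CORPUS_WORDS * KEYPUNCH_CENTS_PER_WORD / 100

# Fraction of the total cost that is due to keypunching.
keypunch_share = KEYPUNCH_CENTS_PER_WORD / TOTAL_CENTS_PER_WORD

print(f"abstracting time: about {minutes:.1f} minutes")
print(f"total cost: about ${total_dollars:.2f}")
print(f"keypunching: about ${keypunch_dollars:.2f} ({keypunch_share:.0%} of total)")
```

On these figures keypunching accounts for about two thirds of the
total cost, which bears on the recommendation that the method of
inputting text be improved as to cost and speed.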

Research  Methodology 

(1)  A  definition  of  an  abstract  has  been  developed  which 
leads  to  a  more  uniform  target  abstract  prepared  by 
humans and permits the creation of automatic abstracts
based  on  machine-recognizable  clues. 

(2)  The  techniques  for  pre-editing  and  keypunching  a  new 
corpus  have  been  rendered  routine. 

(3)  Investigation  has  resulted  in  a  research  methodology 
which  can  be  applied  in  the  development  of  additional 
dictionaries,  weighting  systems,  and  evaluation 
techniques. 

(4)  The principle of attaining flexibility through parameterization
has been verified.

(5)  The  method  of  research,  by  means  of  experimental 
cycles,  has  been  shown  to  be  effective  and  reliable. 

(6)  A method of evaluation by judges' ratings of similarity
has been developed. The resulting evaluation shows
that the machine abstracts have a 66% degree of
similarity with the target abstracts.


Exhibit 1.  Time and Cost Estimates

Input operations

Operation     Time (min/word)    Cost (dollar/word)
Pre-edit      .01                .0003
Keypunch      .03                .01

Machine operations (time in min/word, cost in dollar/word)

                                7090                [illegible]         Total
Operation                       Time      Cost      Time      Cost      Time      Cost
Edit                            .00013    .0011     .0023     .0019     .0024     .0030
Abstracting (Method C-T-L)
  Without Vertical Listing      .00011    .00091    .00046    .00038    .00057    .0013
  With Vertical Listing         .00020    .0017     .0028     .0023     .0030     .0040

[One column label and the footnotes to this exhibit are illegible in the source copy.]


1.4.2  Recommendations

The recommendations of this final report will also be listed
under two headings.

Operating System

(1)  That the operating system be modified so as to abstract
documents up to approximately 10,000 words in length.

(2)  That the operating system be reprogrammed as a more
efficient program in order to reduce operating time and
costs.

(3)  That the method of inputting text be improved as to cost
and speed.

(4)  That the operating system be augmented to handle
abstracting clues found in phrases, clauses, captions,
footnotes, and references.

Research  Methodology 

(1)  That  any  future  research  adhere  to  the  principle  of 
parameterization  in  order  to  gain  flexibility. 

(2)  That the problem of capturing chemical and mathematical
symbols in machine form be investigated
more thoroughly.

(3)  That  investigations  be  made  so  as  to  reduce  both  the 
number  and  size  of  experimental  cycles  via  more 
powerful  statistical  techniques. 

(4)  That further research and experimentation be conducted
on evaluation methods in order to increase
their speed, simplicity, and discrimination.


1.4.3  Outlook

In  spite  of  the  problems  highlighted  earlier  it  is  felt  that 
automatic  abstracts  can  be  defined,  programmed,  and  produced 
in  an  operational  system  so  as  to  compete  with  present  human 
abstracting.  That  the  future  automatic  abstracts  may  be 
different  both  in  content  and  appearance  from  classical  ones 
seems  clear.  However,  it  is  not  thought  that  users  will  be 
materially  inconvenienced  in  having  to  adapt  to  a  new  format. 
Further  research  needs  to  be  performed  in  this  area  of 
linguistic  data  processing,  but  the  general  outlines  of  the  goal 
are  being  seen  clearly  for  the  first  time. 




2.  OPERATING  SYSTEM 


By system aspects we refer to the functional specification of
each of the steps in the automatic abstracting system. In general
these steps are: (1) editing of the textual input, (2) selecting the
abstracting method, and (3) inputting of the dictionary. Further details
are shown in Exhibit 2.

The  operating  system  consists  of  three  main  programs: 

(1)  The  Edit  Program  consists  of  1707  instructions  and 
occupies  30K  of  7090  core  using  STL's  system  B. 

(2)  The  Cue  Dictionary  Program  tape  consists  of  704 
instructions  and  uses  23K  of  7090  core. 

(3)  The  Abstracting  Program  has  2427  instructions  and 
uses  32K  of  7090  core. 

2.1  INPUT

2.1.1  Pre-editing

Certain subject fields introduce difficulties in the pre-editing
step, since at the present time we must keypunch the original document.
Moreover, even if a print reader were available, it would
probably not be able to read a sufficient number of fonts for many
years to come. This means that the text must be manually pre-edited
according to a set of Pre-edit Instructions (see Appendix,
Section 1). The creation of these instructions is not trivial because
it is precisely at this step that we may choose to retain or ignore
critical content and format clues which, once lost, can never be
restored by any programming tricks. The Pre-edit Instructions
must cover problems of formatting, graphics, special symbols, and
special alphabets. As an example, Exhibit 3 presents a reproduction
of the pre-edited version of Document 6 from the Exotic Fuel
Corpus. The notations entered are defined and discussed in the
Appendix, Section 1.


Exhibit 2.  Operating System  (Note: Heavy boxes denote machine operations.)


Exhibit 3.  Reproduction of Pre-edited Version of Document 6


2.1.2  Keypunching

Despite  the  fact  that  keypunch  operators  quickly  adapt  to  new 
challenges  it  is  necessary  to  prepare  a  set  of  Keypunch  Instructions 
(see  Appendix,  Section  2).  These  instructions  are  based  upon  the 
Pre-edit  Instructions  and  are  subject  to  the  boundary  conditions 
imposed  by  present  input  and  output  hardware.  They  contain  rules 
of  sufficient  generality  to  cover  a  wide  variety  of  textual  situations 
and  should  also  be  supported  by  appropriate  examples.  The  purpose 
of  these  Keypunch  Instructions  is  to  relieve,  to  the  maximum  extent 
possible,  the  keypunch  operator  of  making  decisions. 

The  manually  pre-edited  text  is  put  into  machine-readable 
form  by  the  following  sequence  of  operations: 

(1)  Check  pre-editing  by  scanning  of  document. 

(2)  Keypunch  one  line  of  text  to  a  card  on  IBM  024  at 
100  cards  per  hour. 

(3)  Keyverify on IBM 056.

(4)  Sequence  cards. 

(5)  Interpret  cards. 

(6)  List  in  2  parts. 

(7)  Spot  check  original  document  against  machine  listing. 

2.1.3  Edit Program

The Edit Program creates the text tape which is used by the
Abstracting Program.

Text written in free format is punched on cards according to
the Keypunch Instructions (see Appendix, Section 2). The Edit
Program interprets these cards, recognizing title, heading, and
author cards; paragraphs, sentences, number of words in a sentence,
punctuation, capitalization, etc. Each text word takes up five computer
words (see Appendix, Section 3). A total of 1023 text words
make up a record on the BCD output tape. The first word of each
record contains the binary count of text words in the record. The
last five words of each record contain zero; therefore, each record
of the output tape contains 5121 BCD words.
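The record length of 5121 words follows directly from the layout just
described: one count word, five computer words for each of 1023 text
words, and five trailing zero words. The sketch below reproduces that
arithmetic; the function names are assumptions, and the 3000-word
document length is taken from the system limit quoted in Section 1.4.1.

```python
import math

TEXT_WORDS_PER_RECORD = 1023       # text words held by one tape record
COMPUTER_WORDS_PER_TEXT_WORD = 5   # each text word takes five computer words
COUNT_WORDS = 1                    # first word: binary count of text words
TRAILING_ZERO_WORDS = 5            # last five words of each record are zero


def record_length() -> int:
    """Number of computer words in one record of the text tape."""
    body = TEXT_WORDS_PER_RECORD * COMPUTER_WORDS_PER_TEXT_WORD
    return COUNT_WORDS + body + TRAILING_ZERO_WORDS


def records_needed(text_words: int) -> int:
    """Number of records required to hold a document of `text_words` words."""
    return math.ceil(text_words / TEXT_WORDS_PER_RECORD)


print(record_length())        # 1 + 1023 * 5 + 5
print(records_needed(3000))   # records for a document at the 3000-word limit
```

Under these assumptions a document at the 3000-word limit occupies
three records on the text tape.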

The routine also recognizes input errors (see Output Error
List, Appendix, Section 5). The errors are printed on the system
output tape along with a vertical listing of text.

2.1.4  Cue Dictionary Program

The Cue Dictionary Program creates a BCD dictionary tape
to be used by the Abstracting Program when it is determining Cue
and Key words.

The input to the routine consists of at most 1000 words and
their weights: Bonus words have positive integer weights < 99, Null
words have zero weights, and Stigma words have negative integer
weights > -99. The words are punched one per card (see Input
Format, Appendix, Section 3). The words must be in alphabetical
order.
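
The input rules above lend themselves to a simple check. The Python sketch below is a hedged illustration, not the Cue Dictionary Program itself: the function names and the wording of the notices are hypothetical (the actual error list is in the Appendix, Section 5), and the weight bounds are taken as strict inequalities exactly as printed above, which is an assumption.

```python
MAX_ENTRIES = 1000  # "at most 1000 words"


def classify(weight: int) -> str:
    """Return the word class implied by a weight (bounds read as strict)."""
    if 0 < weight < 99:
        return "Bonus"
    if weight == 0:
        return "Null"
    if -99 < weight < 0:
        return "Stigma"
    raise ValueError(f"weight {weight} out of range")


def check_cue_dictionary(entries):
    """entries: list of (word, weight) pairs, one per card.

    Returns a list of notices; the list is empty when no error is found.
    """
    notices = []
    if len(entries) > MAX_ENTRIES:
        notices.append(f"too many entries: {len(entries)}")
    previous = None
    for word, weight in entries:
        if previous is not None and word < previous:
            notices.append(f"out of alphabetical order: {word}")
        try:
            classify(weight)
        except ValueError as exc:
            notices.append(f"{word}: {exc}")
        previous = word
    return notices
```

For example, `check_cue_dictionary([("about", 0), ("significant", 10), ("impossible", -10)])` reports that "impossible" is out of alphabetical order.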

The  output  consists  of  a  BCD  tape  which  contains  only  one 
record. Each entry takes four computer words (described in the Output
Format page, Appendix, Section 3). Certain input errors can be
detected, and a notice will be printed on the system output tape.
Errors are described in the Appendix, Section 5. A BCD listing
of the dictionary tape is also written on the system output tape.


2.2  PROCESSING 


Programs 


It  is  useful  to  separate  the  total  system  into  three  major 
operating  programs:  Edit  Program,  Cue  Dictionary  Program,  and 
Abstracting Program. In addition to these operating programs,
various research programs were written during the research phase.




Based upon the theoretical model or structure underlying the
abstracting system, decisions were made on how best to use a mixture
of computing routines and table-lookup routines. The abstracting
system provides for the modification of the various parameters
that are incorporated in the programming steps or that are stored
in the tables. This allows discoveries made during research to be
transformed easily into improvements in the computer program.

Tables 

The  success  of  an  automatic  abstracting  system  depends 
materially  upon  two  different  aspects.  The  first  is  the  general 
system  of  abstracting  as  given  by  the  sequence  of  programming 
operations.  An  example  is  the  Cue  method.  The  second  aspect 
is  the  specific  entries  in  stored  tables.  An  example  of  a  stored 
table  is  the  Cue  Dictionary  of  1000  words  that  act  either  as  Bonus 
words  which  signal  the  importance  of  a  sentence  or  as  Stigma  words 
that  signal  the  nonsignificance  of  a  sentence.  Such  a  table  includes, 
in  addition  to  the  word,  a  code  indicating  its  semantic  function  and 
its importance weight. Another kind of table is set aside to retain
the title, author, and section headings. The programmer is presented
with the considerable problem of juggling sections of the
core memory so as to accommodate the input text, the program,
and the tables.

2.2.1 Cue Method

The  Cue  method  has  as  its  source  of  machine  recognizable 
clues  the  general  characteristics  of  the  corpus  that  are  provided 
by  the  body  of  the  documents  (see  Section  3,  Rationale  of  the  Four 
Basic  Methods).  The  Cue  method  is  based  on  a  Cue  Dictionary  of 
certain  function  words  apt  to  appear  in  the  body  of  the  document 
(see  Section  3,  Rationale  of  the  Four  Word  Lists).  There  are  three 
classes of Cue words in the Cue Dictionary: Bonus words, Stigma
words, and Null words. Thus, the Cue Dictionary can be conceived
as consisting of three dictionaries: the Bonus Dictionary, Stigma
Dictionary, and Null Dictionary. Bonus words are defined as those
words of the Cue Dictionary that indicate that the sentence in which
they appear should be in the abstract; therefore, Bonus words are
assigned positive weights. Stigma words are defined as those words
of the Cue Dictionary that indicate that the sentence in which they
appear should not be in the abstract; therefore, Stigma words are
assigned negative weights. Null words are defined as those words
of the Cue Dictionary which are irrelevant to selection of the
sentence for the abstract; therefore, Null words are assigned zero
weights. The Null Dictionary was created and maintained for two
reasons: (1) it constitutes a dictionary of common words that may
upon further research be transferred to the Bonus or Stigma
Dictionaries for the Cue method; (2) words that do not match this
dictionary are made candidates for special attention in the Key,
Title, and Location methods.

To illustrate the content (words and weights) and format of
the Cue Dictionary, a portion of it is presented in Exhibit 4.

The  computation  of  sentence  weights  by  the  Cue  method  is 
detailed  in  the  following  steps  which  also  show  the  present  choice 
of  parameter  values: 

(1)  Compare  each  word  of  text  with  the  Cue  Dictionary. 

(2) Tag all Bonus words with the weight b_i (= +10).

Tag all Stigma words with the weight s_i (= -10).

Tag all Null words with the weight n_i (= 0).

(3) Compute the Cue weight C of each sentence by summing
its Cue-word weights b_i, s_i, and n_i.

(4)  Rank  all  sentences  in  decreasing  weight  order. 

(5)  Select  all  sentences  whose  rank  order  is  less  than  the 
threshold  S(=  25  percent  of  the  total  number  of  sentences). 

(6)  Select  all  headings. 


Exhibit 4. Section of Cue Dictionary


(7)  Merge  the  selected  sentences  under  their  proper  headings. 

(8)  Output  title,  authors  and  the  results  of  step  (7). 
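
Steps (1) through (5) of the Cue method can be sketched as follows. This is a minimal Python illustration, not the report's program: the miniature dictionary, the function names, and the reading of the threshold S as "keep the top 25 percent of sentences, rounded down" are all assumptions. Steps (6) through (8), which handle headings and output, are not shown.

```python
# Hypothetical miniature Cue Dictionary: word -> weight
# (Bonus +10, Stigma -10, Null 0, following the parameter values above).
CUE_DICTIONARY = {
    "significant": 10,
    "conclusions": 10,
    "impossible": -10,
    "the": 0,
    "of": 0,
}


def cue_weight(sentence_words, dictionary=CUE_DICTIONARY):
    """Steps (1)-(3): sum the Cue-word weights found in one sentence.

    Words absent from the dictionary are untagged and contribute nothing.
    """
    return sum(dictionary.get(word.lower(), 0) for word in sentence_words)


def select_top(weights, fraction=0.25):
    """Steps (4)-(5): indices of the highest-weighted sentences.

    Sentences are ranked in decreasing weight order (ties keep text order)
    and the selected indices are returned in text order.
    """
    quota = int(len(weights) * fraction)
    ranked = sorted(range(len(weights)), key=lambda i: (-weights[i], i))
    return sorted(ranked[:quota])


sentences = [
    ["The", "results", "are", "significant"],
    ["It", "was", "impossible"],
    ["Conclusions", "of", "significant", "value"],
    ["The", "fuel"],
]
weights = [cue_weight(s) for s in sentences]
print(weights, select_top(weights))
```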

2.2.2  Key  Method 

The  Key  method  has  as  its  source  of  machine-recognizable 
clues  the  specific  characteristics  of  the  body  of  the  document  (see 
Section  3,  Rationale  of  the  Four  Basic  Methods).  The  Key  method 
is  based  on  a  Key  Glossary  of  content  words  taken  from  the  body 
of  the  document  (see  Section  3,  Rationale  of  the  Four  Word  Lists). 
The Key Glossary is created anew for each document to be abstracted
by making statistical counts of the frequency of occurrence of all
words in the document except those that are found in the fixed Cue
Dictionary. The resulting words are ranked in decreasing frequency
order, and all those appearing above a pre-established threshold are
defined as Key words.

The  computation  of  sentence  weights  by  the  Key  method  is 
detailed  in  the  following  steps  which  also  show  the  present  choice 
of parameter values:

(1) Compare each word of text with the Cue Dictionary.

(2) Create table of distinct nonmatching text words (i.e.,
Key-word candidates).

(3) Compute frequency k_i of each Key-word candidate.

(4) Sort table in decreasing frequency order.

(5) Create Key Glossary from all Key-word candidates
whose total frequency (when summed in decreasing
frequency order) exceeds the threshold W (= 10 percent
of the total number of word occurrences in document).

(6) Compare each text word with the Key Glossary and tag
each Key word in text with a weight k_i (= frequency of
occurrence).

(7) Compute the Key weight K of each sentence by summing
its Key-word weights k_i.




(8)  Rank  all  sentences  in  decreasing  weight  order. 

(9)  Select  all  sentences  whose  rank  order  is  less  than  the 
threshold S (= 25 percent of the total number of sentences).

(10)  Select  all  headings. 

(11)  Merge  the  selected  sentences  under  their  proper  headings. 

(12)  Output  title,  authors,  and  the  results  of  step  (11). 
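
Steps (1) through (7) of the Key method can be sketched as follows. This is an illustration under stated assumptions, not the report's routine: the function names are hypothetical, and step (5) is read here as "take candidates in decreasing frequency order until their running total reaches W", which is one possible reading of the threshold rule.

```python
from collections import Counter


def key_glossary(text_words, cue_words, threshold=0.10):
    """Steps (1)-(5): build a Key Glossary (word -> frequency).

    text_words: every word occurrence in the document, in order.
    cue_words:  the words of the fixed Cue Dictionary.
    threshold:  W as a fraction of all word occurrences (assumed reading).
    """
    candidates = Counter(w for w in text_words if w not in cue_words)
    limit = threshold * len(text_words)
    glossary, running = {}, 0
    # Decreasing frequency order; ties broken alphabetically.
    for word, freq in sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0])):
        if running >= limit:
            break
        glossary[word] = freq
        running += freq
    return glossary


def key_weight(sentence_words, glossary):
    """Steps (6)-(7): sum the frequencies of the Key words in a sentence."""
    return sum(glossary.get(word, 0) for word in sentence_words)


text = "fuel burner fuel the burner fuel of additive the test".split()
glossary = key_glossary(text, cue_words={"the", "of"}, threshold=0.5)
print(glossary, key_weight(["fuel", "burner", "test"], glossary))
```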

2.2.3 Title Method

The  Title  method  has  as  its  source  of  machine-recognizable 
clues  the  specific  characteristics  of  the  skeleton  of  the  document, 
i.e., title, headings, format (see Section 3, Rationale of the Four
Basic  Methods).  The  Title  method  is  based  on  a  Title  Glossary 
comprising  those  content  words  found  in  the  title,  subtitles,  and 
headings  of  the  document  to  be  abstracted,  but  excluding  words  of 
the  Null  Dictionary  (see  Section  3,  Rationale  of  the  Four  Word 
Lists). 

The  computation  of  sentence  weights  by  the  Title  method  is 
detailed  in  the  following  steps  which  also  show  the  present  choice 
of parameter values:

(1) Compare each word of title and headings with Null
Dictionary.

(2) Create Title Glossary of nonmatching words.

(3) Tag all words of the Title Glossary with the weight
t_1 (= 11) if they are words of the title.

Tag all words of the Title Glossary with the
weight t_2 (= 7) if they are words of a heading.

(4) Compare each word of text with Title Glossary and tag
each matching word with its Title weight t_1 or t_2.

(5) Compute the Title weight T of each sentence by summing
its title-word weights t_1 and t_2.

(6)  Rank  all  sentences  in  decreasing  weight  order. 




(7)  Select  all  sentences  whose  rank  order  is  less  than  the 
threshold  S(  =  25  percent  of  the  total  number  of 
sentences). 

(8)  Select  all  headings. 

(9)  Merge  the  selected  sentences  under  their  proper  headings. 

(10)  Output  title,  authors,  and  the  results  of  step  (9). 
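
Steps (1) through (5) of the Title method can be sketched as follows. The Python below is an illustration only; the function names are hypothetical, and giving the title weight precedence when a word occurs in both the title and a heading is an assumption, since the text does not state the rule for that case.

```python
T_TITLE = 11    # weight for words of the title
T_HEADING = 7   # weight for words of a heading


def title_glossary(title_words, heading_words, null_words):
    """Steps (1)-(3): build the Title Glossary (word -> weight)."""
    glossary = {}
    for word in heading_words:
        if word not in null_words:
            glossary[word] = T_HEADING
    for word in title_words:
        if word not in null_words:
            glossary[word] = T_TITLE  # assumed: title weight takes precedence
    return glossary


def title_weight(sentence_words, glossary):
    """Steps (4)-(5): sum the Title weights of the matching words."""
    return sum(glossary.get(word, 0) for word in sentence_words)


glossary = title_glossary(
    title_words=["evaluation", "of", "additives"],
    heading_words=["test", "procedure", "of", "additives"],
    null_words={"of", "the"},
)
print(title_weight(["the", "test", "of", "additives"], glossary))
```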

2.2.4 Location Method

The Location method has as its source of machine-recognizable
clues  the  general  characteristics  of  the  corpus  that  are  provided  by 
the  skeletons  of  documents,  i.e.,  title,  headings,  format  (see 
Section  3,  Rationale  of  the  Four  Basic  Methods).  The  Location 
method  is  based  on  a  Heading  Dictionary  of  certain  function  words 
that  appear  in  skeletons  of  documents  (see  Section  3,  Rationale  of 
the  Four  Word  Lists).  Words  of  the  Heading  Dictionary  together 
with their weights are presented in Exhibit 5. In addition to
assigning weights provided by the Heading Dictionary, the Location method
also assigns weights to sentences according to their ordinal position
in the text, i.e., first and last paragraphs, and first and last
sentences of paragraphs. The final Location weight for each
sentence is the sum of the Heading weight and the Ordinal weight.

Computation  of  sentence  weights  by  the  Location  method  is 
detailed  in  the  following  steps  which  also  show  the  present  choice 
of parameter values:

(1) Compare each word of title and headings with Null
Dictionary.

(2) Create table of nonmatching words.

(3) Compare table with Heading Dictionary.

(4) Tag each matching word with weight h_i found in Heading
Dictionary.

(5) Compute the Heading weight H of each heading by summing
its Heading-word weights h_i.




(6)  Tag  each  sentence  with  the  Heading  weight  H  of  its 
heading. 

(7) Tag each sentence of the first paragraph with an Ordinal
weight O_1 (= 18).

Tag each sentence of the last paragraph with an Ordinal
weight O_2 (= 18).

Tag the first sentence of every paragraph with an
Ordinal weight O_3 (= 9).

Tag the last sentence of every paragraph with an
Ordinal weight O_4 (= 9).

(8) Compute the Ordinal weight O of each sentence by
summing its ordinal weights O_i.

(9)  Compute  the  Location  weight  L  of  each  sentence  by 
summing  its  Heading  weight  H  and  Ordinal  weight  O. 

(10)  Rank  all  sentences  in  decreasing  weight  order. 

(11)  Select  all  sentences  whose  rank  order  is  less  than  the 
threshold  S(  =  25  percent  of  the  total  number  of  sentences). 

(12)  Select  all  headings. 

(13)  Merge  the  selected  sentences  under  their  proper  headings. 

(14)  Output  title,  authors,  and  the  results  of  step  (13). 
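
Steps (1) through (9) of the Location method can be sketched as follows. This Python fragment is an illustration under assumptions: the function names and the document representation are hypothetical, only a few Heading Dictionary entries are included (weights as in Exhibit 5), and "first" and "last" paragraph are taken to mean the first and last paragraphs of the whole document.

```python
# A few entries of the Heading Dictionary (weights as listed in Exhibit 5).
HEADING_DICTIONARY = {"summary": 90, "introduction": 60, "results": 30, "test": 10}

O_FIRST_PARAGRAPH = 18
O_LAST_PARAGRAPH = 18
O_FIRST_SENTENCE = 9
O_LAST_SENTENCE = 9


def heading_weight(heading_words, null_words=frozenset()):
    """Steps (1)-(5): sum the Heading-word weights of one heading."""
    return sum(
        HEADING_DICTIONARY.get(word.lower(), 0)
        for word in heading_words
        if word.lower() not in null_words
    )


def location_weights(sections, null_words=frozenset()):
    """Steps (6)-(9): Location weight L = H + O for every sentence.

    sections: list of (heading_words, paragraph_sizes), where
    paragraph_sizes gives the number of sentences in each paragraph.
    Returns one weight per sentence, in text order.
    """
    total_paragraphs = sum(len(sizes) for _, sizes in sections)
    weights, p = [], 0
    for heading_words, sizes in sections:
        h = heading_weight(heading_words, null_words)
        for n_sentences in sizes:
            for s in range(n_sentences):
                o = 0
                if p == 0:
                    o += O_FIRST_PARAGRAPH
                if p == total_paragraphs - 1:
                    o += O_LAST_PARAGRAPH
                if s == 0:
                    o += O_FIRST_SENTENCE
                if s == n_sentences - 1:
                    o += O_LAST_SENTENCE
                weights.append(h + o)
            p += 1
    return weights


sections = [(["Summary"], [2]), (["Test", "Results"], [3, 1])]
print(location_weights(sections))
```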


2.2.5 Combined Methods


The four basic methods described above, used singly or in
combination, yield 15 methods (2^4 - 1 = 15). They have in common the
following sequence of final machine steps:

(1)  Rank  all  sentences  in  decreasing  weight  order. 

(2)  Select  the  sentences  whose  rank  order  is  less  than  the 
threshold  S(  =  25  percent  of  the  total  number  of  sentences). 

(3)  Select  all  headings. 

(4)  Merge  the  selected  sentences  under  their  proper  headings. 

(5)  Output  title,  authors,  and  the  results  of  step  (4). 
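
The count of 15 methods, and the way a combined weight feeds the ranking step, can be sketched as follows. This is a hedged Python illustration: the names are hypothetical, the per-sentence weights are made-up sample values, and the combined weight is taken as the plain sum of the weights of the methods used, as suggested by the description of the Sentence Weight List in Section 2.3.2.

```python
from itertools import combinations

METHODS = ("C", "K", "T", "L")  # Cue, Key, Title, Location


def all_method_combinations():
    """Every non-empty subset of the four basic methods."""
    return [combo
            for size in range(1, len(METHODS) + 1)
            for combo in combinations(METHODS, size)]


def combined_weights(per_method, chosen):
    """Sum, sentence by sentence, the weights of the chosen methods."""
    n_sentences = len(per_method[chosen[0]])
    return [sum(per_method[m][i] for m in chosen) for i in range(n_sentences)]


# Sample weights for four sentences (illustrative values only).
per_method = {
    "C": [10, -10, 20, 0],
    "K": [3, 0, 5, 2],
    "T": [0, 18, 11, 0],
    "L": [117, 40, 49, 76],
}
print(len(all_method_combinations()))
print(combined_weights(per_method, ("C", "T", "L")))
```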




     Heading Word   Weight           Heading Word   Weight

  1. abstract           90       46. nomenclature       50
  2. aim                90       47. object             90
  3. aims               90       48. objective          90
  4. analysis           10       49. objectives         90
  5. approach           30       50. observations       10
  6. background         50       51. performance        10
  7. comparison         20       52. preliminary        20
  8. concluding         90       53. problem            90
  9. conclusion         90       54. problems           90
 10. conclusions        90       55. program            30
 11. consequence        90       56. progress           40
 12. consequences       90       57. project            90
 13. consideration      30       58. projects           90
 14. considerations     30       59. properties         10
 15. data               10       60. purpose            90
 16. description        20       61. purposes           90
 17. descriptions       20       62. recommendation     90
 18. design             20       63. recommendations    90
 19. determination      20       64. recommended        90
 20. determinations     20       65. remarks            60
 21. development        10       66. requirements       50
 22. developments       10       67. results            30
 23. discussion         30       68. review             40
 24. effect             20       69. scope              50
 25. effects            20       70. significance       90
 26. evaluation         30       71. studies            10
 27. evaluations        30       72. study              10
 28. extension          90       73. subject            90
 29. extensions         90       74. subjects           90
 30. finding            90       75. suggested          90
 31. findings           90       76. suggestion         90
 32. foreword           20       77. suggestions        90
 33. future             50       78. summary            90
 34. generalization     90       79. symbols            20
 35. generalizations    90       80. technical          10
 36. goal               90       81. technique          50
 37. goals              90       82. techniques         20
 38. implication        90       83. test               10
 39. implications       90       84. testing            10
 40. introduction       60       85. tests              10
 41. introductory       50       86. theoretical        30
 42. investigation      30       87. theory             30
 43. measurement        10       88. topic              90
 44. method             50       89. topics             90
 45. methods            20       90. work               20

Exhibit 5. Heading Dictionary


It would be expected that of these 15, the Cue-Key-Title-Location
combined method would produce the best abstracts, since
all four categories of machine-recognizable clues come into
play. However, the experimental data show that the combined
Cue-Title-Location method, which excludes the Key method, gives
abstracts as good as or better than those of the combination of all
four methods. The decision to abandon the Key method is reinforced by
the consideration that, upon reprogramming, considerable computer
storage will be saved by omitting the Key routine. The experimental
data underlying this decision are given in Section 3. The data
support a conjecture made during the previous study that Key words,
while important for indexing, are not as important for abstracting.

2.3  OUTPUT 

Hardware 

As in the case of input, we are here confronted with unfortunate
restrictions imposed by output hardware. Although high-speed
printers are available, the most serious difficulty is that of
an overly restricted number of type fonts. This forces us to replace
strings of unusual symbols, e.g., mathematical and chemical, by
the few conventional symbols available at the output printer.
Moreover, we are forced to replace important strings of textual symbols
by only one or two such conventional output symbols. This limitation
has been taken into account in the sentence selection for the
target abstracts by selecting only from the pre-edited document.
The result was that if the significant content of a sentence could not
be output by the computer, the sentence was discredited.

Format 

The format of the classical abstract prepared by humans
comprises title, author, and a paragraph of connected text. The
present output hardware provides little leeway in the composition
of the textual output. However, since an automatic abstract is, at
present, nothing more than an automatic extract, it is desirable to
correct the generally disjointed sequence of selected sentences by
other devices. This problem has been partially solved by capturing
in an automatic abstract those informative features of structure
found in section headings and subheadings.

Dissemination 

Even  though  the  problem  of  dissemination  of  automatic  abstracts 
has  received  little  attention  in  the  literature,  it  nevertheless  will  play 
an  important  part  in  the  general  acceptability  and  utility  of  automatic 
abstracts.  Both  theoretical  and  practical  studies  must  be  made  to 
ascertain how the requester communicates with the abstracting system,
how the system collates similar requests, and how the system
produces  multiple  copies  of  the  abstracts  through  a  suitable  medium. 

2.3.1 System Output

An  abstract  of  a  document  can  be  produced  by  any  combination 
of  the  four  basic  methods  of  calculating  the  sentence  weights  in  a 
document:  Cue,  Key,  Title,  and  Location,  which  are  defined  in 
Section 2.2 of this report. The system output gives the following
information: 

(1)  Document  number 

(2)  Methods  used  to  produce  abstract 

(3)  Title  of  document 

(4)  Author  of  document 

(5) Selected sentences (all are tagged with paragraph and
sentence number)

(6) All headings (headings are identified by a sentence
number of 0)




Exhibits 6-11 show automatic abstracts produced by the six
methods of greatest interest: C (Cue), K (Key), T (Title),
L (Location), and the combined methods C-K-T-L and C-T-L.

Splitting  of  Abstracts 

Depending on the order in which documents are placed on the
Edit tape, it is possible that a document of not over 3069 words will
be split by the Abstracting Program into two abstracts. It is also
possible that a document of over 3069 words will be processed without
a split occurring, owing to its arrangement on the Edit tape. These
situations may never occur, but the possibility exists
(see Appendix, Section 5).

2.3.2 Research Output

A.  Key  Word  List: 

Two lists of Key words are output. The first list is output in
frequency order; an example is given in Exhibit 12. The second list
is output in alphabetical order; an example is given in Exhibit 13.
A description of how the Key words are computed is given in Section
2.2.2 of this report.

B.  Sentence  Weight  List: 

Two  lists  are  output  of  each  sentence  number  in  the  document 
along  with  its  computed  weight.  The  first  list  is  in  sentence  order; 
an  example  is  given  in  Exhibit  14.  The  second  list  is  in  weight 
order; an example is given in Exhibit 15. The weight is the sum
of the weights resulting from whichever of the Cue, Key, Title, and
Location methods are used.

C.  Abstract: 

Defined in Section 2.3.1.




D.  Vertical  Listing: 

This  is  a  list  of  every  word  as  it  occurred  in  the  document 
along  with: 

(1)  Punctuation  before  (PB):  includes  open  quotation  mark, 
open  parenthesis,  initial  capitalization  (denoted  by  *),  etc. 

(2) 1st punctuation after (PA1): includes comma, hyphen,
period, etc.

(3)  2nd  punctuation  after  (PA2):  includes  hyphen,  closed 
parenthesis,  closed  quotation  mark,  etc. 

(4)  Paragraph  number  (P) 

(5)  Sentence  number  within  the  paragraph  (S) 

(6)  Word  number  within  the  sentence  (W) 

(7) Cue weight C (the weight computed according to Section
2.2.1; the Cue weight is computed only if switch 1 is
set equal to 1)

(8) Key weight K (the weight computed according to Section
2.2.2; appears only if switch 3 is set equal to 1)

(9) Title weight T (the weight computed according to Section
2.2.3; output only if switch 2 is set equal to 1)


After  the  last  word  of  each  sentence  is  output,  the  Location  weight 
for  the  sentence  is  output  together  with  the  total  Cue,  Key,  and 
Title  weights  for  that  sentence.  The  Location  weight  is  described 
in Section 2.2.4.
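
The fields of one line of the vertical listing, and the per-sentence totals that follow the last word of a sentence, can be pictured with a small record type. The Python sketch below is illustrative only: the class, its field names, and the use of `None` for a weight whose switch is off are assumptions, not the layout used by the Abstracting Program.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerticalListingEntry:
    """One word of the document with its accompanying fields."""
    word: str
    pb: str = ""            # punctuation before, e.g. "*" for initial capital
    pa1: str = ""           # 1st punctuation after
    pa2: str = ""           # 2nd punctuation after
    paragraph: int = 0      # P
    sentence: int = 0       # S, within the paragraph
    word_number: int = 0    # W, within the sentence
    cue: Optional[int] = None    # C, present only when switch 1 is set
    key: Optional[int] = None    # K, present only when switch 3 is set
    title: Optional[int] = None  # T, present only when switch 2 is set


def sentence_totals(entries):
    """Total Cue, Key, and Title weights over the words of one sentence."""
    return {
        "C": sum(e.cue or 0 for e in entries),
        "K": sum(e.key or 0 for e in entries),
        "T": sum(e.title or 0 for e in entries),
    }


entries = [
    VerticalListingEntry("The", pb="*", paragraph=4, sentence=1,
                         word_number=1, cue=0),
    VerticalListingEntry("conclusions", pa1=".", paragraph=4, sentence=1,
                         word_number=2, cue=10, title=7),
]
print(sentence_totals(entries))
```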

Exhibit  16  shows  a  portion  of  a  vertical  listing  of  Document  6 
from  the  Exotic  Fuel  Corpus. 




DOCUMENT NUMBER 6     PAGE 1

ABSTRACT BASED ON CUE WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER [illegible]
R. L. MACE

I 0   SUMMARY

I I   PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
      COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
      PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
      SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 I   THE ASSUMPTION HAS BEEN MADE IN THIS FUEL EVALUATION THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
      FOR A GIVEN VELOCITY, THE GREATER THE DEGREE OF COMBUSTION STABILITY.

4 S   ON THIS BASIS, THE DATA INDICATE THAT ALL THE ADDITIVE MATERIALS TESTED CAUSED AN INCREASE IN
      STABILITY PERFORMANCE., A FUEL OF RELATIVELY LOW PERFORMANCE SUCH AS TOLUENE BEING BENEFITED TO A
      GREATER EXTENT THAN A HIGH PERFORMANCE FUEL SUCH AS NORMAL HEPTANE.

4 S   IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
      NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
      CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

SO    I. INTRODUCTION

4 1   AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS THE JET FUELS GROUP HAS EVALUATED THE EFFECTS OF
      THE ADDITION OF SMALL AMOUNTS OF DIMETHYLAMINE BORINE ON THE COMBUSTION STABILITY PERFORMANCE OF
      SEVERAL HYDROCARBON FUELS.

R 0   II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

11 2  IN THE PRESENT EVALUATION IT WAS NOT NECESSARY TO CONSIDER THE EFFECT OF CORROSION, CONSEQUENTLY A
      CONTINUOUS FLOW SYSTEM PROVIDING GREATER FLEXIBILITY AND EASIER HANDLING WAS INCORPORATED WHICH
      REQUIRES ONLY SLIGHTLY MORE FUEL PER TEST THAN THE ORIGINAL.

14 0  IV. DESCRIPTION OF TEST FUELS

15 2  THESE FUELS REPRESENT VARIATIONS IN CHEMICAL STRUCTURE WHICH WILL IN TURN PROVIDE INDICES OF BOTH
      GOOD AND POOR COMBUSTION STABILITY PERFORMANCE.

IT 0  V. TEST PROCEDURE

21 0  VI. RESULTS

U 2   THE RECORDED DATA WERE CONVERTED TO VALUES OF HEAT INPUT RATE AND A REFERENCE VELOCITY AT FLASHBACK,
      THUS THE FLASHBACK LIMITS (OR COMBUSTION STABILITY CHARACTERISTICS) ARE ESTABLISHED ON THE BASIS OF
      AN ALLOWABLE HEAT INPUT RATE.

I     THE REGION OF STABLE OPERATION IS DEFINED AS THE STATE OF FLASHBACK-THE CONDITIONS OF COMBUSTION
      WHERE THE FLAME WOULD BECOME ANCHORED TO A FLAME HOLDER-AS IN STABLE GAS TURBINE OR RAM JET
      COMBUSTOR OPERATION-IF THE FLAME HOLDER WERE PROVIDED IN THE BURNER TUBE.

21 0  VII. DISCUSSION

2S 1  PREVIOUS WORK CONDUCTED IN THE PHILLIPS 2 INCH COMBUSTOR (REF. 2) INDICATED THAT SOME ADDITIVES
      CAUSED A SIGNIFICANT INCREASE IN THE PERFORMANCE OF A LOW RATING FUEL WHILE THESE SAME ADDITIVES DID
      NOT SUBSTANTIALLY EFFECT THE HIGHER RATING FUELS.

50 S  THE *BOOST* EFFECTED WITH ADDITIVE IN TOLUENE IS ABOUT FOUR TIMES THAT OF THE INCREASE AFFORDED WITH
      BENZENE, INDICATING THE PHENOMENA TO BE ONE OF GENERAL PERFORMANCE LEVEL RATHER THAN A
      CHARACTERISTIC OF AROMATIC TYPE HYDROCARBONS.

51 2  EXAMINATION OF THESE CURVES SHOWS DIMETHYLAMINE BORINE TO PROVIDE THE GREATEST INCREASE IN STABILITY
      PERFORMANCE AND PROPYLENE OXIDE THE LEAST.

SI I  ALL FOUR ADDITIVES INDICATED THEIR ADDITION TO BE SUBJECT TO THE EFFECT OF DIMINISHING RESULTS UPON
      FURTHER ADDITION-THAT IS, THEIR EFFECT WAS NOT ESSENTIALLY A BLENDING EFFECT.

SI 0  VIII. CONCLUSIONS

14 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
      HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
      PHILLIPS MICROBURNER.

SI 1  2. THE ADDITION OF RELATIVELY LARGE AMOUNTS OF PROPYLENE OXIDE TO TOLUENE WERE NECESSARY TO PROVIDE
      SIGNIFICANT IMPROVEMENT IN STABILITY PERFORMANCE AS INDICATED BY INCREASES IN ALLOWABLE HEAT INPUT

S* i  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
      AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
      WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

SB 2  BENEFICIAL EFFECTS WERE APPRECIABLY LESS WHEN BLENDED WITH A FUEL OF GOOD PERFORMANCE
      CHARACTERISTICS-SUCH AS N-HEPTANE.

IT 0  IX. RECOMMENDATIONS

SR 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
      BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
      FURTHER INVESTIGATED.

•G I  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
      THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.

Exhibit  6.  Automatic  Abstract  of  Document  6  Produced  by  Cue  Method  C 


PAR SENT    DOCUMENT NUMBER 6    PAGE 1

ABSTRACT BASED ON KEY WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER (AD87730)
R. L. BRACE

1 0  SUMMARY

2 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM COMPANY UNDERTOOK THE
EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF
AVIATION GAS TURBINE TYPE FUELS.

3 2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 1  THE ASSUMPTION HAS BEEN MADE IN THIS FUEL EVALUATION THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
FOR A GIVEN VELOCITY, THE GREATER THE DEGREE OF COMBUSTION STABILITY.

4 2  ON THIS BASIS, THE DATA INDICATE THAT ALL THE ADDITIVE MATERIALS TESTED CAUSED AN INCREASE IN
STABILITY PERFORMANCE., A FUEL OF RELATIVELY LOW PERFORMANCE SUCH AS TOLUENE BEING BENEFITED TO A
GREATER EXTENT THAN A HIGH PERFORMANCE FUEL SUCH AS NORMAL HEPTANE.

4 4  WITH RESPECT TO THE DIMETHYLAMINE BORINE, ITS EFFECT AS A FUEL ADDITIVE WAS NOTEWORTHY., 0.1 WEIGHT
PER CENT IN TOLUENE BEING EQUIVALENT TO 20 PER CENT BY WEIGHT OF ADDED PROPYLENE OXIDE.

4 5  IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

5 0  I. INTRODUCTION

8 0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

14 0  IV. DESCRIPTION OF TEST FUELS

17 0  V. TEST PROCEDURE

21 0  VI. RESULTS

THE HEAT INPUT RATE IS DETERMINED BY THE HEATING VALUE OF THE FUEL PER POUND OF AIR OR HIR-LOWER
HEATING VALUE X (FUEL FLOW/AIR FLOW) FR

25 0  VII. DISCUSSION

26 1  THE EVALUATION OF THE EFFECTS OF THE ADDITIVES UNDER CONSIDERATION ON THE COMBUSTION STABILITY
CHARACTERISTICS OF THE FUEL BLENDS TESTED WERE INTERPRETED FROM A PLOT OF HEAT INPUT RATE AT
FLASHBACK AGAINST REFERENCE VELOCITY.

2  THEREFORE, IN ADDITION TO THE EVALUATION OF THE FUEL BLENDS CONTAINING 0.1, 0.5, AND 1 PER CENT
DIMETHYLAMINE BORINE IN TOLUENE (SHOWN IN FIGURE 9) THE SAME ADDITIVE WAS TESTED IN N-HEPTANE (SHOWN
IN FIGURE 4).

1  IN ORDER TO COMPARE THE ADDITIVES EVALUATED, THE ALLOWABLE HEAT INPUT RATES OF THE FUEL-ADDITIVE
BLENDS AT A CONSTANT VELOCITY OF 12 FPS (ARBITRARILY SELECTED) WERE PLOTTED AGAINST ADDITIVE
CONCENTRATION-SHOWN IN FIGURE R.

32 1  MENTION SHOULD BE MADE OF THE FACT THAT DURING THE COMBUSTION OF THE DIMETHYLAMINE BORINE-
HYDROCARBON FUEL BLENDS NO NOTICEABLE ODORS OR SMOKE WERE OBSERVED.

33 0  VIII. CONCLUSIONS

34 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
PHILLIPS MICROBURNER.

36 1  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

37 0  IX. RECOMMENDATIONS

38 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
FURTHER INVESTIGATED.

38 2  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit 7. Automatic Abstract of Document 6 Produced by Key Method K


PAR SENT    DOCUMENT NUMBER 6    PAGE 1

ABSTRACT BASED ON TITLE WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER (AD87730)
R. L. BRACE

1 0  SUMMARY

2 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM COMPANY UNDERTOOK THE
EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF
AVIATION GAS TURBINE TYPE FUELS.

2 2  BECAUSE OF THE SMALL AMOUNT (100 GRAMS) OF DIMETHYLAMINE BORINE RECEIVED FROM CALLERY CHEMICAL
COMPANY, THIS EVALUATION HAS BEEN LIMITED TO THE MEASUREMENT OF ITS EFFECT ON THE FLASH-BACK
CHARACTERISTICS OF THREE PURE HYDROCARBONS (TOLUENE, NORMAL HEPTANE AND BENZENE) IN THE PHILLIPS
MICROBURNER.

3 2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 5  IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

5 0  I. INTRODUCTION

6 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS THE JET FUELS GROUP HAS EVALUATED THE EFFECTS OF
THE ADDITION OF SMALL AMOUNTS OF DIMETHYLAMINE BORINE ON THE COMBUSTION STABILITY PERFORMANCE OF
SEVERAL HYDROCARBON FUELS.

7 1  DUE TO THE SMALL QUANTITY OF THIS MATERIAL OBTAINED THE EVALUATION WAS CONDUCTED IN THE PHILLIPS
MICROBURNER (MODEL 1A) WHICH IS A SLIGHTLY MODIFIED VERSION OF THE ORIGINAL PHILLIPS MICROBURNER
(MODEL 1).

8 0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

14 0  IV. DESCRIPTION OF TEST FUELS

16 1  THE ADDITIVES EVALUATED, DIMETHYLAMINE BORINE, PROPYLENE OXIDE, AMYL NITRATE, AND CUMENE
HYDROPEROXIDE, WERE BLENDED INTO THE HYDROCARBON FUEL BY WEIGHT IN CONCENTRATIONS RANGING FROM 0.1
TO 20 PER CENT.

17 0  V. TEST PROCEDURE

21 0  VI. RESULTS

25 0  VII. DISCUSSION

26 1  THE EVALUATION OF THE EFFECTS OF THE ADDITIVES UNDER CONSIDERATION ON THE COMBUSTION STABILITY
CHARACTERISTICS OF THE FUEL BLENDS TESTED WERE INTERPRETED FROM A PLOT OF HEAT INPUT RATE AT
FLASHBACK AGAINST REFERENCE VELOCITY.

29 1  PREVIOUS WORK CONDUCTED IN THE PHILLIPS 2 INCH COMBUSTOR (REF. 2) INDICATED THAT SOME ADDITIVES
CAUSED A SIGNIFICANT INCREASE IN THE PERFORMANCE OF A LOW RATING FUEL WHILE THESE SAME ADDITIVES DID
NOT SUBSTANTIALLY EFFECT THE HIGHER RATING FUELS.

30 1  TO MORE DEFINITELY ESTABLISH THE EFFECT OF DIMETHYLAMINE BORINE ON THE FLASHBACK LIMITS OF AROMATIC
TYPE HYDROCARBON FUELS, BENZENE WITH O.IA WEIGHT PER CENT ADDITIVE WAS EVALUATED AS SHOWN IN FIGURE
I.

31 3  ALL FOUR ADDITIVES INDICATED THEIR ADDITION TO BE SUBJECT TO THE EFFECT OF DIMINISHING RESULTS UPON
FURTHER ADDITION-THAT IS, THEIR EFFECT WAS NOT ESSENTIALLY A BLENDING EFFECT.

32 1  MENTION SHOULD BE MADE OF THE FACT THAT DURING THE COMBUSTION OF THE DIMETHYLAMINE BORINE-
HYDROCARBON FUEL BLENDS NO NOTICEABLE ODORS OR SMOKE WERE OBSERVED.

33 0  VIII. CONCLUSIONS

34 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
PHILLIPS MICROBURNER.

36 1  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

37 0  IX. RECOMMENDATIONS

38 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
FURTHER INVESTIGATED.

38 2  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit  8.  Automatic  Abstract  of  Document  6  Produced  by  Title  Method  T 


PAR SENT    DOCUMENT NUMBER 6    PAGE 1

ABSTRACT BASED ON LOC. WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER (AD87730)
R. L. BRACE

1 0  SUMMARY

2 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM COMPANY UNDERTOOK THE
EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF
AVIATION GAS TURBINE TYPE FUELS.

2 2  BECAUSE OF THE SMALL AMOUNT (100 GRAMS) OF DIMETHYLAMINE BORINE RECEIVED FROM CALLERY CHEMICAL
COMPANY, THIS EVALUATION HAS BEEN LIMITED TO THE MEASUREMENT OF ITS EFFECT ON THE FLASH-BACK
CHARACTERISTICS OF THREE PURE HYDROCARBONS (TOLUENE, NORMAL HEPTANE AND BENZENE) IN THE PHILLIPS
MICROBURNER.

2 3  DIMETHYLAMINE BORINE CONCENTRATIONS OF FROM 0.1 TO 1.0 PER CENT BY WEIGHT WERE EVALUATED.

3 1  FOR COMPARATIVE PURPOSES TWO COMMON IGNITION ADDITIVES (AMYL NITRATE AND CUMENE HYDROPEROXIDE) WERE
ALSO EVALUATED DURING THIS STUDY, AS WELL AS CONCENTRATIONS UP TO 20 PER CENT BY WEIGHT OF PROPYLENE
OXIDE-A RELATIVELY HIGH FLAME VELOCITY FUEL.

3 2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 1  THE ASSUMPTION HAS BEEN MADE IN THIS FUEL EVALUATION THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
FOR A GIVEN VELOCITY, THE GREATER THE DEGREE OF COMBUSTION STABILITY.

4 2  ON THIS BASIS, THE DATA INDICATE THAT ALL THE ADDITIVE MATERIALS TESTED CAUSED AN INCREASE IN
STABILITY PERFORMANCE., A FUEL OF RELATIVELY LOW PERFORMANCE SUCH AS TOLUENE BEING BENEFITED TO A
GREATER EXTENT THAN A HIGH PERFORMANCE FUEL SUCH AS NORMAL HEPTANE.

4 3  THESE DATA ARE IN AGREEMENT WITH PREVIOUS ADDITIVE STUDIES BY PHILLIPS.

4 4  WITH RESPECT TO THE DIMETHYLAMINE BORINE, ITS EFFECT AS A FUEL ADDITIVE WAS NOTEWORTHY., 0.1 WEIGHT
PER CENT IN TOLUENE BEING EQUIVALENT TO 20 PER CENT BY WEIGHT OF ADDED PROPYLENE OXIDE.

4 5  IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

5 0  I. INTRODUCTION

8 0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

14 0  IV. DESCRIPTION OF TEST FUELS

17 0  V. TEST PROCEDURE

21 0  VI. RESULTS

25 0  VII. DISCUSSION

33 0  VIII. CONCLUSIONS

34 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
PHILLIPS MICROBURNER.

35 1  2. THE ADDITION OF RELATIVELY LARGE AMOUNTS OF PROPYLENE OXIDE TO TOLUENE WERE NECESSARY TO PROVIDE
SIGNIFICANT IMPROVEMENT IN STABILITY PERFORMANCE AS INDICATED BY INCREASES IN ALLOWABLE HEAT INPUT
RATES.

36 1  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

36 2  BENEFICIAL EFFECTS WERE APPRECIABLY LESS WHEN BLENDED WITH A FUEL OF GOOD PERFORMANCE
CHARACTERISTICS-SUCH AS N-HEPTANE.

37 0  IX. RECOMMENDATIONS

38 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
FURTHER INVESTIGATED.

38 2  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit  9.  Automatic  Abstract  of  Document  6  Produced  by 
Location  Method  L 


PAR SENT    DOCUMENT NUMBER 6    PAGE 1

ABSTRACT BASED ON CUE KEY TITLE LOC. WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER (AD87730)
R. L. BRACE

1 0  SUMMARY

2 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM COMPANY UNDERTOOK THE
EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF
AVIATION GAS TURBINE TYPE FUELS.

2 2  BECAUSE OF THE SMALL AMOUNT (100 GRAMS) OF DIMETHYLAMINE BORINE RECEIVED FROM CALLERY CHEMICAL
COMPANY, THIS EVALUATION HAS BEEN LIMITED TO THE MEASUREMENT OF ITS EFFECT ON THE FLASH-BACK
CHARACTERISTICS OF THREE PURE HYDROCARBONS (TOLUENE, NORMAL HEPTANE AND BENZENE) IN THE PHILLIPS
MICROBURNER.

3 2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 1  THE ASSUMPTION HAS BEEN MADE IN THIS FUEL EVALUATION THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
FOR A GIVEN VELOCITY, THE GREATER THE DEGREE OF COMBUSTION STABILITY.

4 2  ON THIS BASIS, THE DATA INDICATE THAT ALL THE ADDITIVE MATERIALS TESTED CAUSED AN INCREASE IN
STABILITY PERFORMANCE., A FUEL OF RELATIVELY LOW PERFORMANCE SUCH AS TOLUENE BEING BENEFITED TO A
GREATER EXTENT THAN A HIGH PERFORMANCE FUEL SUCH AS NORMAL HEPTANE.

4 4  WITH RESPECT TO THE DIMETHYLAMINE BORINE, ITS EFFECT AS A FUEL ADDITIVE WAS NOTEWORTHY., 0.1 WEIGHT
PER CENT IN TOLUENE BEING EQUIVALENT TO 20 PER CENT BY WEIGHT OF ADDED PROPYLENE OXIDE.

4 5  IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

5 0  I. INTRODUCTION

6 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS THE JET FUELS GROUP HAS EVALUATED THE EFFECTS OF
THE ADDITION OF SMALL AMOUNTS OF DIMETHYLAMINE BORINE ON THE COMBUSTION STABILITY PERFORMANCE OF
SEVERAL HYDROCARBON FUELS.

7 1  DUE TO THE SMALL QUANTITY OF THIS MATERIAL OBTAINED THE EVALUATION WAS CONDUCTED IN THE PHILLIPS
MICROBURNER (MODEL 1A) WHICH IS A SLIGHTLY MODIFIED VERSION OF THE ORIGINAL PHILLIPS MICROBURNER
(MODEL 1).

8 0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

14 0  IV. DESCRIPTION OF TEST FUELS

17 0  V. TEST PROCEDURE

21 0  VI. RESULTS

25 0  VII. DISCUSSION

29 1  PREVIOUS WORK CONDUCTED IN THE PHILLIPS 2 INCH COMBUSTOR (REF. 2) INDICATED THAT SOME ADDITIVES
CAUSED A SIGNIFICANT INCREASE IN THE PERFORMANCE OF A LOW RATING FUEL WHILE THESE SAME ADDITIVES DID
NOT SUBSTANTIALLY EFFECT THE HIGHER RATING FUELS.

33 0  VIII. CONCLUSIONS

34 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
PHILLIPS MICROBURNER.

35 1  2. THE ADDITION OF RELATIVELY LARGE AMOUNTS OF PROPYLENE OXIDE TO TOLUENE WERE NECESSARY TO PROVIDE
SIGNIFICANT IMPROVEMENT IN STABILITY PERFORMANCE AS INDICATED BY INCREASES IN ALLOWABLE HEAT INPUT
RATES.

36 1  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

37 0  IX. RECOMMENDATIONS

38 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
FURTHER INVESTIGATED.

38 2  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit 10. Automatic Abstract of Document 6 Produced by
Combined Method C-K-T-L


PAR SENT    DOCUMENT NUMBER 6    PAGE 1

ABSTRACT BASED ON CUE TITLE LOC. WTS.

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON COMBUSTION STABILITY CHARACTERISTICS
OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS MICROBURNER (AD87730)
R. L. BRACE

2 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM COMPANY UNDERTOOK THE
EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF
AVIATION GAS TURBINE TYPE FUELS.

2 2  BECAUSE OF THE SMALL AMOUNT (100 GRAMS) OF DIMETHYLAMINE BORINE RECEIVED FROM CALLERY CHEMICAL
COMPANY, THIS EVALUATION HAS BEEN LIMITED TO THE MEASUREMENT OF ITS EFFECT ON THE FLASH-BACK
CHARACTERISTICS OF THREE PURE HYDROCARBONS (TOLUENE, NORMAL HEPTANE AND BENZENE) IN THE PHILLIPS
MICROBURNER.

3 2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD INDICATED THAT SUCH MATERIALS
COULD SUBSTANTIALLY INCREASE THE MAXIMUM RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW
PERFORMANCE FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS-PARTICULARLY WHEN OPERATING UNDER
SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW VELOCITY OR LOW COMBUSTION PRESSURE).

4 1  THE ASSUMPTION HAS BEEN MADE IN THIS FUEL EVALUATION THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
FOR A GIVEN VELOCITY, THE GREATER THE DEGREE OF COMBUSTION STABILITY.

4 2  ON THIS BASIS, THE DATA INDICATE THAT ALL THE ADDITIVE MATERIALS TESTED CAUSED AN INCREASE IN
STABILITY PERFORMANCE., A FUEL OF RELATIVELY LOW PERFORMANCE SUCH AS TOLUENE BEING BENEFITED TO A
GREATER EXTENT THAN A HIGH PERFORMANCE FUEL SUCH AS NORMAL HEPTANE.

4 5  IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE SEVERAL PURE HYDROCARBONS WHICH
NORMALLY DIFFERED QUITE WIDELY IN PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

5 0  I. INTRODUCTION

6 1  AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS THE JET FUELS GROUP HAS EVALUATED THE EFFECTS OF
THE ADDITION OF SMALL AMOUNTS OF DIMETHYLAMINE BORINE ON THE COMBUSTION STABILITY PERFORMANCE OF
SEVERAL HYDROCARBON FUELS.

7 1  DUE TO THE SMALL QUANTITY OF THIS MATERIAL OBTAINED THE EVALUATION WAS CONDUCTED IN THE PHILLIPS
MICROBURNER (MODEL 1A) WHICH IS A SLIGHTLY MODIFIED VERSION OF THE ORIGINAL PHILLIPS MICROBURNER
(MODEL 1).

8 0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

10 0  III. DESCRIPTION OF TEST APPARATUS

14 0  IV. DESCRIPTION OF TEST FUELS

17 0  V. TEST PROCEDURE

21 0  VI. RESULTS

25 0  VII. DISCUSSION

29 1  PREVIOUS WORK CONDUCTED IN THE PHILLIPS 2 INCH COMBUSTOR (REF. 2) INDICATED THAT SOME ADDITIVES
CAUSED A SIGNIFICANT INCREASE IN THE PERFORMANCE OF A LOW RATING FUEL WHILE THESE SAME ADDITIVES DID
NOT SUBSTANTIALLY EFFECT THE HIGHER RATING FUELS.

31 3  ALL FOUR ADDITIVES INDICATED THEIR ADDITION TO BE SUBJECT TO THE EFFECT OF DIMINISHING RESULTS UPON
FURTHER ADDITION-THAT IS, THEIR EFFECT WAS NOT ESSENTIALLY A BLENDING EFFECT.

33 0  VIII. CONCLUSIONS

34 1  1. THE ADDITION OF DIMETHYLAMINE BORINE IN CONCENTRATIONS OF ONE PER CENT BY WEIGHT TO JET FUEL TYPE
HYDROCARBONS RESULTED IN A UNIFORMLY HIGH LEVEL OF COMBUSTION STABILITY PERFORMANCE AS MEASURED BY
PHILLIPS MICROBURNER.

35 1  2. THE ADDITION OF RELATIVELY LARGE AMOUNTS OF PROPYLENE OXIDE TO TOLUENE WERE NECESSARY TO PROVIDE
SIGNIFICANT IMPROVEMENT IN STABILITY PERFORMANCE AS INDICATED BY INCREASES IN ALLOWABLE HEAT INPUT
RATES.

36 1  3. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL NITRATE, CUMENE HYDROPEROXIDE,
AND DIMETHYLAMINE BORINE ALL RESULTED IN IMPROVED STABILITY PERFORMANCE., THE GREATEST INCREASES
WERE SHOWN WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS-SUCH AS TOLUENE.

37 0  IX. RECOMMENDATIONS

38 1  BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK LIMITS OF THE ADDITIVE-FUEL
BLENDS TESTED IN THE MICROBURNER (MODEL 1A) IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE
FURTHER INVESTIGATED.

38 2  THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND COMBUSTION EFFICIENCY EFFECTS IN
THE PHILLIPS 2 INCH COMBUSTOR AND AN INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit  11.  Automatic  Abstract  of  Document  6  Produced  by 
Combined  Method  C-T-L 


Exhibit  12.  Keywords  of  Document  6  Ordered  by  Decreasing  Weights  Assigned  by  Key  Method  K 


Exhibit 13. Keywords of Document 6 in Alphabetic Order as Produced by Key Method K



DOCUMENT NUMBER 6    PAGE 1

PARAGRAPH    SENTENCE    WEIGHT

1 

0 

1100 

1 

1 

203 

2 

2 

21S 

2 

3 

139 

3 

1 

1A9 

3 

2 

329 

A 

1 

200 

i 

A 

2 

290 

i 

A 

3 

130 

1 

A 

A 

122 

A 

S 

21A 

3 

0 

1090 

I 

A 

1 

192 

i 

A 

2 

100 

1. 

7 

1 

192 

S 

0 

lOAO 

9 

1 

lOA 

j 

9 

2 

TA 

I 

9 

3 

90 

1. 

10 

0 

1030 

11 

1 

129 

11 

2 

177 

11 

3 

AT 

12 

1 

AT 

13 

1 

AO 

- 

lA 

0 

lOAO 

IS 

1 

99 

IS 

2 

lAO 

j 

lA 

1 

100 

i 

IT 

0 

1000 

(  . 

IS 

1 

90 

19 

1 

38 

19 

2 

20 

19 

3 

10 

19 

A 

19 

20 

1 

9A 

21 

0 

lOAO 

22 

1 

99 

22 

2 

121 

22 

3 

AT 

23 

1 

AT 

2A 

1 

117 

29 

0 

lOAO 

2A 

1 

191 

2A 

2 

98 

27 

1 

100 

27 

2 

91 

2S 

1 

71 

{ 

2S 

2 

91 

i 

28 

3 

60 

L 

28 

A 

70 

29 

1 

223 

29 

2 

33 

j 

29 

3 

98 

29 

A 

AO 

L 

30 

1 

13A 

30 

2 

30 

30 

3 

110 

1^ 

31 

1 

A9 

1 

31 

2 

123 

L 

31 

3 

210 

32 

1 

121 

32 

2 

79 

33 

0 

1100 

1 

3A 

1 

292 

1 

3S 

1 

198 

38 

1 

291 

3A 

2 

189 

37 

0 

1090 

f 

38 

1 

29A 

1 

38 

2 

2A1 

I 

Exhibit 14. Sentence Numbers of Document 6 in Textual Order
with Weights Assigned by Combined Method C-T-L

DOCUMENT NUMBER 6    PAGE 1

PARAGRAPH    SENTENCE    WEIGHT

1 

0 

1100 

13 

0 

1100 

17 

0 

1090 

1 

0 

1090 

ai 

0 

1040 

23 

0 

1040 

14 

0 

1040 

1 

0 

1040 

10 

0 

1030 

17 

0 

1000 

1 

2 

329 

30 

1 

294 

34 

1 

2S2 

34 

1 

2S1 

4 

2 

290 

31 

2 

241 

29 

1 

223 

2 

2 

21S 

4 

S 

214 

31 

3 

210 

2 

1 

203 

4 

1 

200 

IS 

1 

190 

7 

1 

192 

4 

1 

192 

14 

2 

149 

11 

2 

177 

24 

1 

ISl 

3 

1 

149 

IS 

2 

140 

2 

3 

139 

30 

1 

134 

4 

3 

130 

11 

1 

12S 

31 

2 

123 

4 

4 

122 

22 

2 

121 

32 

1 

121 

24 

1 

117 

30 

3 

110 

9 

1 

104 

4 

2 

100 

27 

1 

100 

14 

1 

100 

24 

2 

94 

22 

1 

99 

IS 

1 

99 

27 

2 

91 

10 

1 

90 

32 

2 

79 

9 

2 

74 

20 

1 

71 

20 

4 

70 

12 

1 

47 

11 

3 

47 

22 

3 

47 

23 

1 

47 

20 

3 

40 

29 

3 

94 

20 

1 

94 

20 

2 

91 

9 

3 

90 

31 

1 

49 

29 

4 

40 

13 

1 

40 

19 

1 

34 

29 

2 

S3 

30 

2 

30 

19 

2 

20 

19 

4 

19 

19 

3 

10 

Exhibit  15.  Sentence  Numbers  of  Document  6  Ordered  by  Decreasing 
Weights  Assigned  by  Combined  Method  C-T-L 


UXT 


PAl  PAS  P 


li 


CUE  NT  KEY  WT  TITLE  HT 




start  of 

OUCLHEHT  OOCA 

• 

EVALUATION 

0 

1 

1 

t 

OF 

0 

1 

2 

• 

THE 

0 

1 

3 

• 

EFFECT 

0 

4 

• 

OF 

0 

1 

9 

• 

CtPETHVLANlNE 

0 

1 

4 

• 

BQRINE 

0 

1 

7 

• 

ANC 

0 

i 

B 

a 

SEVERAL 

0 

1 

9 

a 

OTHER 

0 

i 

10 

a 

ACCl TIVES 

0 

1 

ll 

a 

ON 

0 

1 

12 

a 

COPBUSTION 

0 

i 

13 

a 

stability 

0 

X 

14 

a 

CHARACTERISTICS 

0 

1 

19 

a 

OF 

0 

1 

16 

a 

VARIOUS 

0 

1 

17 

hychgcarbon 

0 

I 

16 

a 

TYPE 

0 

1 

19 

a 

FUELS 

0 

1 

20 

a 

IN 

0 

1 

21 

a 

PHILLIPS 

0 

1 

22 

a 

microburner 

0 

1 

23 

( 

A087730 

> 

0 

1 

24 

CUE 

height  0 

REV 

HEIGHT 

0 

TITLE  HEIGHT 

0 

LOCATION 

HEIGHT 

AUTHOR 

aa 

R 

0 

1 

7 

aa 

L 

0 

1 

2 

7 

a 

BRACE 

0 

1 

3 

7 

CUE 

aElGHT  0 

REV 

HEIGHT 

0 

TITLE  HEIGHT 

21 

LOCATION 

HEIGHT 

SUHHE AGING 

a 

SUMMARY 

1 

0 

1 

10 

CUE 

HEIGHT  10 

REV 

HEIGHT 

0 

TITLE  height 

0 

LOCATION 

HEIGHT 

paragraph 

a 

AT 

2 

1 

0 

THE 

2 

1 

2 

0 

RECUeST 

2 

1 

3 

OF 

2 

1 

4 

0 

THE 

2 

1 

5 

0 

a 

NAVY 

2 

1 

6 

a 

BUREAU 

2 

1 

7 

OF 

2 

1 

6 

0 

a 

AERONAUTICS 

2 

X 

9 

a 

PHILLIPS 

2 

X 

10 

14 

11 

a 

PETROLEUM 

2 

X 

ll 

COMPANY 

2 

X 

12 

UNCERTOOK 

2 

X 

13 

the 

2 

X 

14 

0 

evaluation 

2 

X 

15 

11 

OF 

2 

X 

14 

0 

OIMETHYLAMINE 

2 

X 

17 

13 

ll 

BORINE 

2 

X 

18 

13 

ll 

AS 

2 

X 

19 

0 

AN 

2 

X 

20 

0 

ACCl riVE 

2 

X 

21 

19 

FCR 

2 

X 

22 

0 

IMPROVING 

2 

X 

23 

THE 

2 

X 

24 

0 

CCMHUSTION 

2 

X 

29 

16 

11 

CHARACTERISTICS 

2 

X 

24 

11 

OF 

2 

1 

27 

0 

AVIATION 

2 

X 

26 

GAS 

2 

X 

29 

TURBINE 

2 

X 

30 

TYPE 

2 

1 

31 

ll 

FUELS. 

2 

X 

32 

CUE 

HEIGHT  0 

REV 

HEIGHT 

71 

TITLE  HEIGHT 

77 

LOCATION 

HEIGHT 

a 

BECAUSE 

2 

2 

1 

10 

OF 

2 

2 

2 

0 



Exhibit 16. Portion of Vertical Listing of Document 6 Produced by
Combined Method C-K-T-L


3.  RESEARCH  METHODOLOGY 


The  various  steps  of  the  research  methodology  are  shown  in 
Exhibit  17.  The  research  procedure,  stated  in  its  simplest  terms, 
is  to 

(1) Study the attributes of manual abstracts of documents
of a selected portion (Sample Library) of the Experimental
Library.

(2) Formulate the definition of abstract suitable for the
particular nature of the documents for which success
is desired.

(3) Program a machine to produce abstracts and conduct
experiments on documents of the Experimental
Library.

(4)  Test  the  programs  on  new  documents  of  the  Test 
Library. 

(5)  Evaluate  the  machine  abstracts. 


3.1 THE RESEARCH PROBLEM

We will begin with a brief summary of the problem of programming
a machine to abstract a document. We first assume that an
extract of a document (i.e., a selection of certain sentences of the
document) can serve as an abstract.* Thus the machine program
will be a sentence selection routine which acts on a document stored
in machine memory. To create such a program, we will suppose
that we have a data base consisting of a collection of documents,
each having an abstract prepared by a human. That is to say, an
abstractor is given certain instructions concerning desirable features
of abstracts and, on the basis of these instructions, prepares
an abstract of a document. This gives rise to the problems of the
definition of abstracts and the nature of the instructions used to prepare
abstracts as well as the related problem of the nature of the
document collection. Presuming the successful negotiation of these
matters, we consider then the input (i.e., the sentences of the document)
and the output (i.e., the selected sentences of the document)
of the abstracting program.

* This hypothesis, i.e., the substitutability of extracts for abstracts,
is discussed in References 1 and 7. In what follows "abstract" means
"extract-type abstract."
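The input and output just described can be sketched in a few lines of Python. This is only an illustration, not the program used in the study: it assumes each sentence already carries a numeric weight, and the names `Sentence` and `select_extract` and the `fraction` cutoff are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    paragraph: int   # paragraph number within the document
    number: int      # sentence number within the paragraph (0 = heading)
    text: str
    weight: float    # sentence-ranking number, assumed already computed


def select_extract(sentences, fraction=0.25):
    """Return the highest-weighted sentences, restored to textual order."""
    if not sentences:
        return []
    keep = max(1, round(len(sentences) * fraction))
    # sorted() is stable, so equally weighted sentences keep textual order.
    ranked = sorted(sentences, key=lambda s: -s.weight)
    chosen = set(ranked[:keep])
    return [s for s in sentences if s in chosen]
```

Restoring textual order after ranking keeps the extract readable as running text, which is how the exhibits above present their selected sentences.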

Clearly, the machine can operate only on machine-recognizable
features of the text. Such features we call characteristics (e.g.,
occurrence of a certain word, position of sentence in paragraph,
number of words in sentence, etc.). A characteristic is said to be
relevant if it tends to be associated with selected sentences of the
data base (positively relevant) or tends to be associated with unselected
sentences (negatively relevant). Using this notion we recast
the problem as follows. We (1) find certain relevant characteristics
of the text, (2) program the machine to recognize such characteristics,
and (3) give the machine computational rules to weight
sentences according to the presence of these characteristics. These
rules are determined from statistical and linguistic considerations
concerning the data base, i.e., the Sample Library. The final sentence
weights are then used as sentence-ranking numbers.
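A minimal sketch may make the notions of relevance and sentence weighting concrete. Both functions rest on assumptions that the text does not state: relevance is measured here as a simple difference of occurrence rates, and the sentence weight is taken to be a linear combination of per-method weights (cue, key, title, location). The function names and the coefficient values are hypothetical.

```python
def relevance(has_characteristic, selected):
    """Occurrence rate among selected sentences minus the rate among unselected ones.

    A positive result suggests a positively relevant characteristic,
    a negative result a negatively relevant one.
    """
    in_selected = [h for h, s in zip(has_characteristic, selected) if s]
    in_unselected = [h for h, s in zip(has_characteristic, selected) if not s]

    def rate(flags):
        return sum(flags) / len(flags) if flags else 0.0

    return rate(in_selected) - rate(in_unselected)


def sentence_weight(method_weights, coefficients):
    """Combine per-method weights into one sentence-ranking number (assumed linear rule)."""
    return sum(coefficients[name] * value for name, value in method_weights.items())
```

For instance, with illustrative values, `sentence_weight({"cue": 10, "title": 20, "location": 5}, {"cue": 1, "title": 2, "location": 1})` gives 55.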

The next step is to examine the output of this procedure and
make appropriate adjustments in the weighting schemes. Finally,
after making the last adjustments, we test the technique by an evaluation
procedure in which machine-produced abstracts are compared
with manually-produced abstracts of documents in the Test Library.

In the following sections we discuss the details of the major
research areas: The Research Problem (3.1), Corpus (3.2),
Theory (3.3), Experiments (3.4), Evaluation (3.5). The research
steps are diagrammed in Exhibit 17.

3.2 CORPUS

Documents  taken  from  a  particular  corpus  or  body  of  text  have 
strong  similarities  among  one  another  and  strong  dissimilarities 
with  documents  taken  from  a  different  corpus.  This  means  that  one 
of  the  first  steps  in  conducting  research  in  automatic  abstracting  is 
that  of  choosing  an  appropriate  corpus.  For  example,  challenges 




Exhibit 17. Research Methodology for Automatic Abstracting Study


arise due to the subject matter, e.g., education vs. mathematics;
to the publishing medium, e.g., newspapers vs. textbooks; to
editors' rules regarding acceptability for publication, e.g.,
research papers vs. expository works; and to the author's style and
compactness of presentation.

3.2.1 Heterogeneous Corpus

The  corpus  of  the  previous  contract  was  used  in  the  present 
study  primarily  as  a  basis  for  research  data.  The  Heterogeneous 
Corpus  contained  200  documents  comprising  4  different  classes  of 
subject  matter:  Physical  Science,  Life  Science,  Information  Science, 
and  Humanities.  Because  of  its  heterogeneity  and  size,  this  corpus 
was  deemed  adequate  for  some  of  the  statistical  data  desired,  e.g.  , 
data  on  sentence  length,  data  on  common  words,  and  positional  data. 
The  details  of  these  data  are  given  in  Section  3.  3.  4. 

In addition, a special batch TE of 11 documents from the
Heterogeneous Corpus was abstracted by the methods developed
under the current contract (1) to compare the results with the
corresponding abstracts produced under the previous method, and
(2) to test the routines designed for a homogeneous corpus (exotic
fuel) on a heterogeneous one.

3.2.2 Exotic Fuel Corpus

The  corpus  used  in  the  present  contract  consisted  of  200  docu¬ 
ments  dealing  with  the  chemistry  of  exotic  fuels.  These  200  docu¬ 
ments  were  obtained  from  ASTIA  as  either  originals  or  copies  of 
contractor  reports  to  various  government  agencies.  They  were 
selected  by  examining  over  a  thousand  ASTIA  cards  and  choosing 
documents  whose  length  did  not  exceed  4096  words;  re-examination 
showed  this  threshold  to  be  3350  words.  This  condition  of  length 
was  imposed  by  the  nature  of  our  computer  system  because  of  limited 
storage  capacity  in  the  computer  memory.  Of  the  200  selected  docu¬ 
ments,  85  were  unclassified  and  115  were  confidential. 




Since  the  documents  in  the  Exotic  Fuel  Corpus  were  technical 
reports  written  by  chemists,  chemical  engineers,  and  physicists, 
the  style  of  presentation  was  that  of  a  typical  scientific  report, 
i.  e.  ,  highly  formatted,  terse,  and  containing  equations  and  experi¬ 
mental  figures.  The  lengths  of  the  documents  ranged  from  a  mini¬ 
mum  of  100  words  to  a  maximum  of  3900  words  with  an  average  of 
approximately  2500  words. 

For  convenience  in  machine  processing,  the  Exotic  Fuel  Corpus 
was  divided  into  several  batches  which  comprised  the  Experimental 
Library  and  the  Test  Library.  The  following  table  presents  the  de¬ 
tails  of  this  batching: 


Summary of Batches

     Experimental Library (EL)              Test Library (TL)

Batch   No. of      Classifi-         Batch   No. of      Classifi-
        Documents   cation                    Documents   cation

A           9       U                 TA         20       U
B          20       U                 TB         20       U
C          12       U                 TC         38       C
D          55       C                 TD         22       C
E           4       U

Total     100                         Total     100

(U = unclassified, C = confidential)

3.2.3 Libraries

As  seen  in  Exhibit  17,  which  outlines  the  research  methodology 
for  the  automatic  abstract  study,  two  libraries  are  necessary.  Ex¬ 
perimental  Library  and  Test  Library  are  defined  as  follows:  the 
Experimental  Library  is  a  source  of  documents  for  the  data  base  and 
experimentation;  the  Test  Library  is  a  collection  of  documents  re¬ 
served  for  evaluation  after  experimentation  is  completed.  Documents 
that  are  selected  from  the  Experimental  Library  for  detailed  study 




are  said  to  comprise  the  Sample  Library,  which  serves  as  the  re¬ 
search  data  base. 

The following table summarizes the number of documents and
abstracts falling in each of the two libraries for both the Heterogeneous
Corpus* and the Exotic Fuel Corpus:


Summary of Libraries

                    Documents         Manual Abstracts    Automatic Abstracts

                 Hetero-   Exotic     Hetero-   Exotic     Hetero-   Exotic
                 geneous   Fuel       geneous   Fuel       geneous   Fuel
                 Corpus    Corpus     Corpus    Corpus     Corpus    Corpus

Experimental
Library            100       100        100        27         20       100

Test
Library            100       100         12        40         42       100

Total              200       200        112        67         62       200

Both corpora            400                  179                  262

3.3 THEORY

3.3.1 Definition and Creation of Abstracts

In  the  course  of  this  research  the  problem  formulation  pre¬ 
viously  derived  from  a  consideration  of  the  nature  of  abstracts** 
was  reviewed.  A  reasonable  goal  of  the  present  study  was  development 


*The meaning of Sample Library and Experimental Library used in
the present study is not that of Reference 7.

**See  References  1  and  7. 




of a method of machine-producing indicative-type abstracts for use
in the screening of documents. Because of the special nature of
the Exotic Fuel Corpus it was decided that such a function of abstracts
could be achieved by specifying the information content of abstracts of
technical papers. With this goal in mind, a semiformal explication of
the notion of abstract was developed. This comprises a sequence of
four definitions: (1) Eligible Sentence*, (2) Nonredundant Sentence,
(3) Coherence, and (4) Abstract.


Definition  1.  Eligible  Sentence.  A  sentence  is  called  eligible 
if  it  contains  information  of  at  least  one  of  the  following  six  types: 

S  Subject  Matter.  Information  indicating  the  general  subject 
area  that  is  the  author's  principal  concern;  i.  e. ,  what? 

P  Purpose.  Information  indicating  whether  the  author's 
principal  intent  is  to  offer  original  research  findings,  to 
survey  or  evaluate  the  work  performed  by  others,  to  pre¬ 
sent  a  speculative  or  theoretical  discussion,  or  to  serve 
some  other  main  purpose;  i.  e.  ,  why? 

M  Methods.  Information  indicating  the  methods  used  in  con¬ 
ducting  the  research.  Depending  on  the  type  of  research, 
such  statements  may  refer  to  experimental  procedures, 
mathematical  techniques,  or  other  methods  of  scientific 
investigation;  i.  e.  ,  how? 

C  Conclusions  or  Findings.  Information  indicating  the  results 
obtained  in  the  research  or  the  findings  of  the  author. 

G  Generalizations  or  Implications.  Information  indicating  the 
significance  of  the  research  and  its  bearing  on  broader  tech¬ 
nical  problems  or  theory. 

R  Recommendations  or  Suggestions.  Information  indicating 
recommended  courses  of  action  or  suggested  areas  of 
future  work. 


*The neutral word "eligible" is used rather than the word
"significant."




Definition  2.  Nonredundant  Sentence.  A  sentence  or  group 
of  sentences  will  be  called  nonredundant  if  it  has  none  of  the  follow¬ 
ing  properties: 

(1)  It  is  an  exact  replica  of  another  sentence  or  a  smaller 
group  of  sentences. 

(2)  It  is  a  paraphrase  of  another  sentence  or  a  smaller  group 
of  sentences. 

(3)  It  expresses  the  same  content  as  another  sentence  or  a 
smaller  group  of  sentences. 
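
Of the three properties in Definition 2, only the first (exact replication) lends itself to a purely mechanical check; paraphrase and sameness of content call for human judgment. The following is a minimal sketch of such a check for single sentences. The normalization rule (ignore case, punctuation, and spacing) and the function names are assumptions made for illustration.

```python
import re


def normalize(sentence):
    """Reduce a sentence to lower-case words so that replicas compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))


def exact_replicas(sentences):
    """Return (later_index, earlier_index) pairs of replicated sentences."""
    first_seen = {}
    pairs = []
    for i, sentence in enumerate(sentences):
        key = normalize(sentence)
        if key in first_seen:
            pairs.append((i, first_seen[key]))
        else:
            first_seen[key] = i
    return pairs
```

A sentence flagged by this check is redundant in the sense of property (1); a sentence that passes may still be redundant under properties (2) or (3).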

Definition  3.  Coherence.  A  sequence  of  sentences  is  said 
to  be  coherent  if  it  has  the  following  properties: 

(1)  All  crucial  antecedents  and  referents  are  present. 

(2)  No  semantic  discontinuities  are  present. 

(3)  The  sequence  of  ideas  progresses  logically. 

Definition 4. Abstract. A sequence of sentences of a document
selected in text order is said to be an abstract of the document
if it has the following properties:

(1)  Property  of  Content:  It  contains  only  eligible  sentences. 

(2)  Property  of  Length:  It  contains  nonredundant  sentences. 

(3)  Property  of  Form:  It  is  coherent. 

A  set  of  Instructions  for  Abstractors  (Exhibit  18)  was  then  de¬ 
veloped  in  order  to  create  abstracts  satisfying  the  above  definition. 
Abstracts  generated  in  accordance  with  these  instructions  then  serve 
as  the  target  abstracts  required  by  the  research  methodology.  Be¬ 
cause  of  the  requirements  of  a  mechanized  technique,  it  was  stipu¬ 
lated  that  the  target  abstracts  contain  a  fixed  percent  of  the  sentences 
of the original document. We chose 25 percent, but believe that this
figure is not optimum.
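
A fixed-percent extract can be sketched as follows, assuming that each sentence already carries a numerical weight. The rounding rule (round up, with at least one sentence) and the tie-breaking rule (earlier sentence first) are assumptions; the report does not state them at this point.

```python
import math


def select_quota(weights, fraction=0.25):
    """Return the indices of the highest-weighted sentences, in text order.

    weights:  one number per sentence, in document order.
    fraction: share of the sentences to keep (25 percent by default).
    """
    n = len(weights)
    if n == 0:
        return []
    quota = max(1, math.ceil(n * fraction))
    # Rank by descending weight; ties go to the earlier sentence.
    ranked = sorted(range(n), key=lambda i: (-weights[i], i))
    # Restore text order, as Definition 4 requires.
    return sorted(ranked[:quota])


chosen = select_quota([3, 0, 5, 1, 0, 2, 4, 0])
```

For the eight hypothetical weights shown, the quota is two sentences and the sketch keeps the third and seventh.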




During the preparation of these abstracts a document analysis
sheet was also prepared. This gives an itemization of the sentences
with (1) notations as to the occurrence of cointensional sentences,
(2) classification of sentences into the six information categories,
and (3) qualitative judgments as to the abstract-worthiness of
sentences. The selected sentences are then checked on the analysis
sheet in Column H ("H" for "human"). A Column R is also provided
to record a random selection of sentences. The randomly
generated extract, so defined, is then used as a second control in
the evaluation procedure. Exhibit 19 gives the completed analysis
sheet for Exotic Fuel Document 6 (Exhibit 3). The corresponding
Target Abstract is given in Exhibit 20 and a Random Extract in
Exhibit 21.
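
The random control can be sketched in a few lines. This is an illustration only: the use of a seeded generator, the rounding of the quota, and the function name `random_extract` are assumptions and do not describe the procedure actually used to fill Column R.

```python
import random


def random_extract(n_sentences, fraction=0.25, seed=None):
    """Pick a random sample of sentence indices and return it in text order."""
    rng = random.Random(seed)  # a seed makes the control reproducible
    quota = round(n_sentences * fraction)
    return sorted(rng.sample(range(n_sentences), quota))


control = random_extract(40, seed=6)
```

Because the sample is drawn without regard to content, it gives a baseline against which both the human and the machine selections can be compared.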

A major part of the abstracting effort was devoted to locating
duplications of information throughout the document. A typical
document in the corpus examined presented a summary at the beginning,
a section of orientation, a section of methods, a discussion section,
and frequently conclusions which were a rephrasing of the initial
summary. Hence the same information frequently appeared in two or
three places in a document. Occasionally, however, the opposite
situation  prevailed,  wherein  nearly  the  entire  document  comprised 
conclusions  of  the  research.  When  a  hierarchy  cannot  be  established, 
either  of  significance  or  of  generality,  then  a  25  percent  abstract  in 
the  sense  of  a  sampling  of  content  cannot  be  logically  composed.  In 
this  case  a  description  is  the  appropriate  representation.  If  the 
document  contains  one  it  is  rarely  25  percent  of  the  sentences;  if 
it  does  not,  the  present  extracting  method  will  not  produce  a  co¬ 
herent  abstract.  Also,  in  order  to  specify  the  full  25  percent  of 
the  sentences,  occasionally  it  was  necessary  to  select  several  less 
condensed  sentences  in  place  of  the  single  more  succinct,  more 
"meaty"  one  which  was  their  equivalent.  Although  a  25  percent  selec¬ 
tion  seemed  workable  for  most  of  the  documents,  in  some  instances 
it  was  not  appropriate,  most  frequently  by  exceeding  optimum  length. 




Step 1.  NUMBER: Assign numerical designations to the
         sentences of the document.

Step 2.  ORIENT: Read the definitions of eligibility,
         nonredundancy, coherence, and abstract.

Step 3.  FAMILIARIZE: Read the article quickly to gain
         familiarity.

Step 4.  CLASSIFY: Reread the article slowly, classifying
         successive sentences if eligible. Record
         the class on the document analysis sheet.
         If a sentence is ineligible or obviously not
         abstract-worthy, mark nothing.

Step 5.  LOCATE DEPENDENCIES: Note with an arrow on the
         analysis sheet if an antecedent sentence would
         have to be selected also in order to preserve the
         meaning.

Step 6.  LOCATE REDUNDANCIES: Reread the article to locate
         redundancies and note them in as great detail as
         necessary on the analysis sheet.

Step 7.  SELECT: Select 25 percent of the number of sentences
         in the document by applying the principle of
         coherence and a comparative notion of
         sentence importance. Avoid redundancy.

Step 8.  REVIEW: Reread the finished abstract to check its
         coherence, conformity to prescribed
         length, and freedom from redundancy.


Exhibit  18.  Instructions  for  Abstractors 


Exhibit 19. Analysis of Document 6 and Comparison of Human and Machine Abstracts


Notes  on  Exhibit  19 


1.  Abstracts  are  designated  according  to  the  method  of  selection: 

H  Human  (elsewhere  referred  to  as  the  "Target"  abstract) 

CTL  Cue-Title-Location  combined  method,  chosen  as  the  preferred 
method 

C  Cue  method 

T  Title  method 

L  Location  method 

K  Key  method 

CTLK  Cue-Title-Location-Key  combined  method 
R  Random  selection 

2.  All  headings  (designated  by  a  "0"  sentence  number)  are  output  in  all 
machine  abstracts,  although  not  checked  in  the  table.  They  have  been 
inserted  in  the  Target  Abstract  and  Random  Extract  (Exhibits  20  and 
21)  for  purposes  of  comparison  with  the  machine  abstracts. 

3.  The class designations of sentences are:

    S   Subject
    P   Purpose
    M   Method
    C   Conclusion
    G   Generalization
    R   Recommendation

4.  The following symbols are used to indicate cointensional sentences.
Sentences are identified by the paragraph and sentence number
assigned by the computer.

    [symbol illegible]   is equivalent to
    %   is approximately equivalent to
    >   contains more information than
    <   contains less information than
    }   the two sentences taken together

5.  An arrow indicates that for preservation of meaning an antecedent
sentence is required. Note is made, when possible, of the words
indicating this dependence.

6.  Sentences marked with an asterisk are no longer meaningful
because they contain or refer to essential symbols, equations, tables,
figures, and graphs which have been deleted in the pre-editing step.


Exhibit  19.  Analysis  of  Document  6  and  Comparison  of  Human  and 
Machine  Abstracts  (Continued) 


PAR SENT

DOCUMENT NUMBER 6

ABSTRACT BASED ON HUMAN SELECTION

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON
COMBUSTION STABILITY CHARACTERISTICS OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS
MICROBURNER (AOS7TM)

R. L. BRACE

SUMMARY

AT THE REQUEST OF THE NAVY BUREAU OF AERONAUTICS, PHILLIPS PETROLEUM
COMPANY UNDERTOOK THE EVALUATION OF DIMETHYLAMINE BORINE AS AN ADDITIVE
FOR IMPROVING THE COMBUSTION CHARACTERISTICS OF AVIATION GAS TURBINE TYPE
FUELS.

BECAUSE OF THE SMALL AMOUNT (100 GRAMS) OF DIMETHYLAMINE BORINE RECEIVED
FROM CALLERY CHEMICAL COMPANY, THIS EVALUATION HAS BEEN LIMITED TO THE
MEASUREMENT OF ITS EFFECT ON THE FLASH-BACK CHARACTERISTICS OF THREE
PURE HYDROCARBONS (TOLUENE, NORMAL HEPTANE AND BENZENE) IN THE PHILLIPS
MICROBURNER.

DIMETHYLAMINE BORINE CONCENTRATIONS OF FROM 0.1 TO 1.0 PER CENT BY WEIGHT
WERE EVALUATED.

FOR COMPARATIVE PURPOSES TWO COMMON IGNITION ADDITIVES (AMYL NITRATE
AND CUMENE HYDROPEROXIDE) WERE ALSO EVALUATED DURING THIS STUDY, AS
WELL AS CONCENTRATIONS UP TO 20 PER CENT BY WEIGHT OF PROPYLENE OXIDE -
A RELATIVELY HIGH FLAME VELOCITY FUEL.

PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD
INDICATED THAT SUCH MATERIALS COULD SUBSTANTIALLY INCREASE THE MAXIMUM
RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW PERFORMANCE
FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS - PARTICULARLY WHEN
OPERATING UNDER SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW
VELOCITY OR LOW COMBUSTION PRESSURE).

WITH RESPECT TO THE DIMETHYLAMINE BORINE, ITS EFFECT AS A FUEL ADDITIVE
WAS NOTEWORTHY; 0.1 WEIGHT PER CENT IN TOLUENE BEING EQUIVALENT TO 20
PER CENT BY WEIGHT OF ADDED PROPYLENE OXIDE.

IN GENERAL, ADDITIVE CONCENTRATIONS OF ONE PER CENT BY WEIGHT IN THE
SEVERAL PURE HYDROCARBONS WHICH NORMALLY DIFFERED QUITE WIDELY IN
PERFORMANCE, PRODUCED UNIFORMLY SUPERIOR COMBUSTION STABILITY
CHARACTERISTICS AS MEASURED USING THE PHILLIPS MICROBURNER.

I. INTRODUCTION

II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

III. DESCRIPTION OF TEST APPARATUS

IV. DESCRIPTION OF TEST FUELS

THESE FUELS REPRESENT VARIATIONS IN CHEMICAL STRUCTURE WHICH WILL IN
TURN PROVIDE INDICES OF BOTH GOOD AND POOR COMBUSTION STABILITY
PERFORMANCE.

0  V. TEST PROCEDURE

0  VI. RESULTS

1  THE REGION OF STABLE OPERATION IS DEFINED AS THE STATE OF "FLASH BACK" -
THE CONDITIONS OF COMBUSTION WHERE THE FLAME WOULD BECOME ANCHORED
TO A FLAME HOLDER - AS IN STABLE GAS TURBINE OR RAM JET COMBUSTOR
OPERATION - IF THE FLAME HOLDER WERE PROVIDED IN THE BURNER TUBE.

0  VII. DISCUSSION

THE ASSUMPTION IS MADE THAT THE GREATER THE ALLOWABLE HEAT INPUT RATE
AT A GIVEN VELOCITY, THE GREATER THE DEGREE OF STABILITY.

ALL FOUR ADDITIVES INDICATED THEIR ADDITION TO BE SUBJECT TO THE EFFECT
OF DEMINISHING RESULTS UPON FURTHER ADDITION - THAT IS, THEIR EFFECT WAS
NOT ESSENTIALLY A BLENDING EFFECT.

MENTION SHOULD BE MADE OF THE FACT THAT DURING THE COMBUSTION OF THE
DIMETHYLAMINE BORINE-HYDROCARBON FUEL BLENDS NO NOTICEABLE ODORS OR
SMOKE WERE OBSERVED.

VIII. CONCLUSIONS

1. THE ADDITION OF ADDITIVE CONCENTRATIONS (UP TO 1 PER CENT) OF AMYL
NITRATE, CUMENE HYDROPEROXIDE, AND DIMETHYLAMINE BORINE ALL RESULTED
IN IMPROVED STABILITY PERFORMANCE; THE GREATEST INCREASES WERE SHOWN
WHEN BLENDED WITH A FUEL OF POOR PERFORMANCE CHARACTERISTICS - SUCH AS
TOLUENE.

BENEFICIAL EFFECTS WERE APPRECIABLY LESS WHEN BLENDED WITH A FUEL OF
GOOD PERFORMANCE CHARACTERISTICS - SUCH AS N-HEPTANE.

IX. RECOMMENDATIONS

BASED ON THE EVALUATION OF THE EFFECTS OF ADDITIVES ON THE FLASHBACK
LIMITS OF THE ADDITIVE-FUEL BLENDS TESTED IN THE MICROBURNER (MODEL 1A)
IT IS RECOMMENDED THAT DIMETHYLAMINE BORINE SHOULD BE FURTHER
INVESTIGATED.

THIS FUTURE WORK SHOULD INCLUDE STUDY OF COMBUSTION STABILITY AND
COMBUSTION EFFICIENCY EFFECTS IN THE PHILLIPS 2 INCH COMBUSTOR AND AN
INVESTIGATION OF ITS INFLUENCE ON COMBUSTION CLEANLINESS.


Exhibit  20.  Target  Abstract 


PAR SENT

DOCUMENT NUMBER 6

EXTRACT BASED ON RANDOM SELECTION

EVALUATION OF THE EFFECT OF DIMETHYLAMINE BORINE AND SEVERAL OTHER ADDITIVES ON
COMBUSTION STABILITY CHARACTERISTICS OF VARIOUS HYDROCARBON TYPE FUELS IN PHILLIPS
MICROBURNER (ADS7730)

R. L. BRACE

1  0  SUMMARY

3  2  PREVIOUS STUDIES IN PHILLIPS 2 INCH TURBOJET ENGINE TYPE COMBUSTOR HAD
INDICATED THAT SUCH MATERIAL COULD SUBSTANTIALLY INCREASE THE MAXIMUM
RATE OF HEAT RELEASE ATTAINABLE, ESPECIALLY WITH LOW PERFORMANCE
FUELS SUCH AS THE ISO PARAFFIN TYPE HYDROCARBONS - PARTICULARLY WHEN
OPERATING UNDER SEVERE CONDITIONS FOR COMBUSTION (I.E., HIGH AIR FLOW
VELOCITY OR LOW COMBUSTION PRESSURE).

4  3  THESE DATA ARE IN AGREEMENT WITH PREVIOUS ADDITIVE STUDIES BY PHILLIPS.

5  0  I. INTRODUCTION

6  2  THE DIMETHYLAMINE BORINE WAS SUPPLIED TO PHILLIPS BY THE CALLERY
CHEMICAL COMPANY.

8  0  II. DESCRIPTION OF PHILLIPS MICROBURNER (MODEL 1A)

9  3  THE DETAILS OF THE MODEL 1A ARE SHOWN IN FIGURE 1.

10  0  III. DESCRIPTION OF TEST APPARATUS

11  2  IN THE PRESENT EVALUATION IT WAS NOT NECESSARY TO CONSIDER THE EFFECT
OF CORROSION. CONSEQUENTLY A CONTINUOUS FLOW SYSTEM PROVIDING GREATER
FLEXIBILITY AND EASIER HANDLING WAS INCORPORATED WHICH REQUIRES ONLY
SLIGHTLY MORE FUEL PER TEST THAN THE ORIGINAL.

THE DETAILS OF THESE MODIFICATIONS AND OF THE TEST APPARATUS ARE SHOWN
IN SCHEMATIC IN FIGURE 2.

IV. DESCRIPTION OF TEST FUELS

THESE FUELS REPRESENT VARIATIONS IN CHEMICAL STRUCTURE WHICH WILL IN TURN
PROVIDE INDICES OF BOTH GOOD AND POOR COMBUSTION STABILITY PERFORMANCE.

V. TEST PROCEDURE

IGNITION OF THE THEN FUEL-RICH MIXTURE WAS ACCOMPLISHED BY APPLYING A
LIGHTED, PORTABLE PROPANE TORCH TO THE TOP OF THE BURNER TUBE.

AFTER CHECKING THE POINT AT LEAST ONCE MORE THE AIR FLOW WAS INCREASED
ANOTHER INCREMENT AND THE PROCESS REPEATED.

VI. RESULTS

THE RESULTS OF THE EVALUATION OF FLASHBACK LIMITS OF THE FUELS AND FUEL-
ADDITIVE BLENDS ARE SUMMARIZED IN TABLE 2 AND SHOWN GRAPHICALLY IN
FIGURE 3 THROUGH [illegible].

THE REFERENCE VELOCITY IS DETERMINED BY THE AIR FLOW CONDITIONS AT
ENTRY TO THE BURNER TUBE - NEGLECTING THE MASS OF THE FUEL PARTICLES.

VII. DISCUSSION

THIS EVALUATION SERVED PRIMARILY AS A REFERENCE PLANE OF PERFORMANCE
IMPROVEMENT.

THEREFORE, IN ADDITION TO THE EVALUATION OF THE FUEL BLENDS CONTAINING
0.1, 0.5, AND 1 PER CENT DIMETHYLAMINE BORINE IN TOLUENE (SHOWN IN FIGURE
5) THE SAME ADDITIVE WAS TESTED IN N-HEPTANE (SHOWN IN FIGURE 6).

VARIFICATION OF THE PRIOR RESULTS WAS FURTHER ESTABLISHED BY TESTING TWO
OF THE ADDITIVES PREVIOUSLY EVALUATED (REF 2.)

ALL FOUR ADDITIVES INDICATED THEIR ADDITION TO BE SUBJECT TO THE EFFECT
OF DEMINISHING RESULTS UPON FURTHER ADDITION - THAT IS, THEIR EFFECT WAS
NOT ESSENTIALLY A BLENDING EFFECT.

VIII. CONCLUSIONS

BENEFICIAL EFFECTS WERE APPRECIABLY LESS WHEN BLENDED WITH A FUEL OF
GOOD PERFORMANCE CHARACTERISTICS - SUCH AS N-HEPTANE.

IX. RECOMMENDATIONS


Exhibit  21.  Random  Extract 


An  attempt  was  made  to  classify  qualifying  sentences  as  to 
degree  of  "abstract-worthiness"  (a  concept  analogous  to  that  of  the 
machine  weightings),  but  in  practice  it  did  not  prove  a  sufficient 
method for sentence selection. It is often impossible to tell whether a
sentence, seen only in the context of an abstract, reports a well-known
fact, the result of a previous experiment, or a conclusion of the
present study. Furthermore, a sentence frequently depends on a
previous one for its meaning (and there may be a series of such
dependencies), requiring selection of both, even though the antecedent
sentence may not qualify in terms of content. Certain words or
phrases frequently indicate this situation, such as "this," "therefore,"
"since," etc., and a study of their occurrence might be useful in
refining the machine method. The more an author uses such words for
reasons  of  logical  structuring  and  stylistic  fluidity,  the  more  difficult 
it  is  to  remove  sentences  from  context  without  destroying  their  func¬ 
tion  in  the  document. 
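
The suggested study of dependency words can be given a concrete, if crude, form. The sketch below flags a sentence as dependent when it contains one of the three words quoted above and then extends a selection backward so that each dependent sentence brings its predecessor with it. Two strong assumptions are made: that the antecedent is always the immediately preceding sentence, and that every occurrence of a cue word signals dependence (which is false, for example, for a temporal "since"). This is not part of the system described in this report.

```python
import re

# The three cue words named in the text; a working list would be longer.
DEPENDENCY_CUES = {"this", "therefore", "since"}


def needs_antecedent(sentence):
    """True if the sentence contains a word suggesting dependence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(w in DEPENDENCY_CUES for w in words)


def close_under_dependencies(selected, sentences):
    """Add the preceding sentence for every selected dependent sentence.

    Chains are followed, so a series of dependencies pulls in the
    whole series. Indices are returned in text order.
    """
    result = set(selected)
    pending = list(selected)
    while pending:
        i = pending.pop()
        if i > 0 and needs_antecedent(sentences[i]) and (i - 1) not in result:
            result.add(i - 1)
            pending.append(i - 1)
    return sorted(result)
```

A selection extended in this way exceeds the length quota, which illustrates the tension between coherence and length noted above.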

In  short,  it  is  found  that  in  composing  as  coherent  and  mean¬ 
ingful  an  abstract  as  possible,  requirements  of  antecedents,  absence 
of  elements  eliminated  by  pre-editing,  suppression  of  redundancy, 
and  considerations  related  to  the  length  quota  often  take  precedence 
over  a  sentence-by-sentence  rating  of  "abstract-worthiness. "  It 
is  in  precisely  these  aspects  that  the  human  extracts  differ  con¬ 
sistently  from  the  machine-produced  ones. 

3.3.2 Guiding Principles

A  set  of  principles  was  next  devised  to  guide  the  development 
of  automatic  abstracting  methods.  Among  the  several  principles, 
we  stress  one  that  seems  dominant: 

Principle  1.  Insure  that  the  automatic  method  detects  and 
uses  all  abstracting  clues  (e.g.  ,  of  meaning, 
significance,  organization,  etc. )  provided  by 
the  author,  editor,  and  printer. 




This  principle  focuses  on  capturing  automatically  as  many  clues  as 
possible  that  were,  either  consciously  or  unconsciously,  provided 
by  the  creators  of  the  document.  For  example,  the  skilled  author 
selects  an  appropriate  title,  organizes  his  thoughts  in  distinct  sec¬ 
tions  with  appropriate  subtitles,  condenses  information  in  the  cap¬ 
tions  of  graphs  and  tables,  and  uses  footnotes  and  references  in  re¬ 
vealing  ways. 

It  is  instructive  to  regard  the  problem  of  automatic  abstracting 
in  the  light  of  several  other  principles; 


Principle  2.  Use  criteria  of  selection,  i.  e.  ,  a  system 
of  rewards  for  desired  sentences. 


Principle  3.  Use  criteria  of  rejection,  i.  e.  ,  a  system 
of  penalties  for  undesired  sentences. 


Principle  4.  Use  a  system  of  thresholds,  both  for  ac¬ 
ceptance  and  rejection,  that  allows  adjust¬ 
ments  by  parameterization. 

Principle 5.  Use a method which is a function of several
distinct factors, such as statistical, semantic,
syntactic, locational, etc.


It  thus  appears  that  an  abstracting  system  based  on  assigning 
numerical  weights  (weights  above  a  threshold  for  positively  relevant 
characteristics,  at  the  threshold  for  irrelevant  characteristics,  be¬ 
low  the  threshold  for  negatively  relevant  characteristics)  to  machine 
recognizable  sentence  characteristics,  will  satisfy  these  principles. 
For  computational  simplicity,  addition  of  the  weights  is  used  to 
arrive  at  the  final  ranking  numbers. 
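
The additive scheme can be sketched as follows. The threshold is taken to be zero, and the characteristic names and weight values are hypothetical; the weights actually used are a matter for the experiments.

```python
THRESHOLD = 0  # assumed neutral weight for irrelevant characteristics

# Hypothetical weights: above the threshold for positively relevant
# characteristics, below it for negatively relevant ones.
WEIGHTS = {
    "bonus_word": 2,
    "title_word": 1,
    "null_word": THRESHOLD,
    "stigma_word": -2,
}


def sentence_weight(characteristics, weights=WEIGHTS):
    """Add the weights of every characteristic occurrence in a sentence."""
    return sum(weights.get(c, THRESHOLD) for c in characteristics)


def rank_sentences(sentences):
    """Return sentence indices ordered by descending weight.

    sentences: one list of characteristic occurrences per sentence.
    Ties go to the earlier sentence.
    """
    scored = [(sentence_weight(chars), i) for i, chars in enumerate(sentences)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Rewards (Principle 2) and penalties (Principle 3) appear here as the signs of the weights, and changing the table of weights is the parameterization called for by Principle 4.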

In  the  next  section  we  describe  four  basic  methods  which  rely 
upon  the  five  principles  stated  above.  It  should  be  noted,  however, 
that  in  regard  to  Principle  1  we  developed  methods  utilizing  clues 
in  the  title  and  subtitles  but  were  unable,  because  of  limited  time 
and  scope,  to  program  in  the  operating  system  clues  known  to  exist 
in  captions  of  tables,  footnotes,  and  references. 




3.3.3 The Four Basic Methods


The  automatic  abstracting  system  developed  under  this  con¬ 
tract  is  founded  on  four  basic  methods:  Key,  Cue,  Title,  Location. 
Their  origin  will  now  be  described. 

First,  the  clues  of  the  document  may  come  from  two  structural 
sources. 

(1)  Clues  in  the  skeleton  of  the  document,  e.g.  ,  titles, 
headings,  format. 

(2)  Clues  in  the  body  of  the  document,  e.  g.  ,  the  text. 

Second,  the  characteristics  of  the  corpus  may  be  considered 
from  two  linguistic  points  of  view. 

(1)  General  characteristics  of  the  entire  corpus;  e.g.  , 
certain  function  words. 

(2)  Specific  characteristics  of  the  individual  documents; 
e.  g.  ,  high  frequency  content  words. 

These  two  sources  of  clues  and  two  types  of  linguistic  prop¬ 
erties  yield  four  opportunities  to  create  distinct  basic  methods  of 
automatic  abstracting  defined  simply  by  the  class  of  clues  they  rely 
upon.  They  are  displayed  below: 


Rationale of the Four Basic Methods

Type of                        Sources of Structural Clues
Linguistic
Property             Body of Document     Skeleton of Document
                     (Text)               (Title, Headings, Format)

General
Characteristics      Cue Method           Location Method
of Corpus

Specific
Characteristics      Key Method           Title Method
of Document



When  this  classification  is  applied  to  words  (considered  as 
clues)  it  yields  four  distinct  word  lists.  Of  these,  we  distinguish 
between  two  different  types  of  word  lists. 

First:  A  dictionary  is  a  word  list  plus  numerical  tags  which 
forms  a  fixed  input  to  the  automatic  abstracting  system;  thus  a 
dictionary  is  independent  of  the  words  in  the  document  being  ab¬ 
stracted. 

Second:  A  glossary  is  a  word  list  plus  numerical  tags  which 
forms  a  variable  input  to  the  automatic  abstracting  system;  thus  a 
glossary  is  dependent  on  the  words  of  the  document  being  ab¬ 
stracted. 
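
The distinction between the two kinds of word list can be illustrated by a short sketch. The dictionary entries, the numerical tags, and the rule for building a glossary from a title are all assumptions made for illustration; the rules actually used are those of Section 2.

```python
import re

# A dictionary is fixed input: the same for every document.
# Entries and tags here are hypothetical.
CUE_DICTIONARY = {"of": 0, "the": 0, "on": 0, "and": 0,
                  "significant": 1, "figure": -1}


def title_glossary(title, dictionary=CUE_DICTIONARY, tag=1):
    """Build a glossary from one document's title.

    A glossary is variable input: it is rebuilt for each document.
    Words already in the fixed dictionary are left out (an assumption).
    """
    words = re.findall(r"[a-z]+", title.lower())
    return {w: tag for w in words if w not in dictionary}


glossary = title_glossary("Evaluation of the Effect of Additives")
```

The dictionary would be loaded once for the whole corpus, whereas the glossary is recomputed whenever a new document is read.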

This  refinement  and  standardization  of  terminology  has  pro¬ 
vided  a  convenient  breakdown  of  the  word  lists  that  correspond  to 
each  of  the  four  basic  methods.  The  table  below  shows  this  re¬ 
lationship: 


Rationale of the Four Word Lists

                     Body of Document       Skeleton of Document

Function Words       Cue Dictionary         Heading Dictionary
                     (Bonus, Stigma,
                     and Null
                     Dictionaries)

Content Words        Key Glossary           Title Glossary

The  method  of  generating  the  Key  and  Title  Glossaries  given 
the  Cue  Dictionary  has  been  described  in  Section  2. 

Exhibit  22  presents  an  inventory  of  the  sentence  character¬ 
istics  considered  in  the  course  of  the  research.  The  rules  governing 
their  use  in  the  abstracting  programs  developed  have  been  given  in 
Section  2. 




3.3.4 Characteristics, Dictionaries, and Weights


(1)  Cue  Dictionary.  The  previous  research  used  two  distinct 
Cue  Dictionaries:  (1)  a  dictionary  based  on  purely  statistical  con¬ 
siderations,  (2)  a  dictionary  based  on  purely  linguistic  considera¬ 
tions.  *  In  the  present  research  it  was  decided  to  base  the  new  Cue 
Dictionary  on  a  combination  of  statistical  and  linguistic  properties 
by  taking  the  following  steps. 

Step 1. Selection of Candidates for Cue Words. The Sample
Library of the Heterogeneous Corpus used in the previous research
consisted of 16,386 different words. The output data of the
Concordance Program* gave the following statistics for each word:
(1) frequency in corpus; (2) number of documents in which the word
occurred (dispersion); (3) selection ratio (ratio of occurrences in
abstractor-selected sentences to frequency in corpus). Tabulating
equipment was used to separate the words having dispersion less than
5 from those having dispersion 5 or greater. The latter class
consisted of 3,314 words which were deemed to be the source of function
words of the language and hence candidates for Cue words.
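
The three statistics of Step 1 can be computed as in the following sketch. It assumes that each document is available as a list of sentences, each sentence being a list of words together with a flag showing whether the abstractor selected it. The function names are hypothetical and do not describe the Concordance Program itself.

```python
from collections import Counter


def word_statistics(documents):
    """Return {word: (frequency, dispersion, selection_ratio)}.

    documents: list of documents; each document is a list of
               (words, was_selected) sentence pairs.
    """
    frequency = Counter()   # occurrences in the corpus
    dispersion = Counter()  # number of documents containing the word
    selected = Counter()    # occurrences in abstractor-selected sentences
    for document in documents:
        seen_here = set()
        for words, was_selected in document:
            for w in words:
                frequency[w] += 1
                seen_here.add(w)
                if was_selected:
                    selected[w] += 1
        dispersion.update(seen_here)
    return {w: (frequency[w], dispersion[w], selected[w] / frequency[w])
            for w in frequency}


def cue_candidates(stats, min_dispersion=5):
    """Keep the words occurring in at least min_dispersion documents."""
    return {w for w, (_, d, _) in stats.items() if d >= min_dispersion}
```

The dispersion test acts as a filter for function words: a word that appears in many documents of a heterogeneous corpus is unlikely to be tied to any one subject.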

Step 2. Classification of Candidates. Two statistical thresholds
were established, one above and one below the mean selection ratio
for all words. The following classes were then defined and listed by
means of a computer program:

    Null candidates:    dispersion greater than 30 and selection
                        ratio between thresholds; yielded 282 words

    Bonus candidates:   selection ratio above upper threshold;
                        yielded 986 words

    Stigma candidates:  selection ratio below lower threshold;
                        yielded 1,177 words

    Residue:            dispersion less than 30 and selection
                        ratio between thresholds; yielded 869 words


* See Reference 7.
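The screening of Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the program used in the study: the names WordStats and classify_candidate are hypothetical, the threshold values are passed in by the caller because the report does not state them, and a dispersion of exactly 30 (which the report leaves unspecified) is assumed to fall in the residue.

```python
from dataclasses import dataclass


@dataclass
class WordStats:
    word: str
    frequency: int    # occurrences in the corpus
    dispersion: int   # number of documents containing the word
    selected: int     # occurrences in abstractor-selected sentences

    @property
    def selection_ratio(self) -> float:
        return self.selected / self.frequency


def classify_candidate(stats, lower, upper, min_dispersion=5, null_dispersion=30):
    """Return 'null', 'bonus', 'stigma', 'residue', or None (not a candidate)."""
    if stats.dispersion < min_dispersion:
        return None  # Step 1: dispersion below 5, not a Cue word candidate
    ratio = stats.selection_ratio
    if ratio > upper:
        return "bonus"
    if ratio < lower:
        return "stigma"
    # Ratio lies between the thresholds, so dispersion decides.
    # Assumption: a dispersion of exactly 30 is treated as residue.
    return "null" if stats.dispersion > null_dispersion else "residue"
```

With hypothetical thresholds of 0.15 and 0.35, a widely dispersed word with a selection ratio of 0.25 comes out as a Null candidate, while a word with the same ratio found in only 10 documents falls in the residue.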


Step 3. Review and Compile Dictionary, Version 1. The
listings of Step 2 were reviewed, and words were reclassified when
necessary. Two conditions guided this effort: (1) assignments must
not be counterintuitive; (2) the computer program allows only 1000
words in the Cue Dictionary. It was found that most of the
counterintuitive assignments occurred when the word frequency was
low, giving an unreliable selection ratio. For high frequency words,
it was found that intuition usually bore out the statistical data.
The second condition was found to be easily met by considering the
probability of occurrence of a word. That is to say, it is possible
to adhere to the Cue Dictionary size limitation by casting out
certain low frequency words without fear of loss of effectiveness.
It turned out that most of the Stigma candidates were reclassified
as residue by this criterion. A dictionary was compiled of the Null,
Bonus, and Stigma words remaining. Also, a new class of
indeterminate words was formed and listed for further study. The
breakdown was: 324 Null words, 568 Bonus words, 21 Stigma words,
314 indeterminate words.

Step 4. Experimentation and Analysis. The Cue Dictionary,
Version 1, was used in 3 abstracting experiments (1-3)* on a sample
of exotic fuel documents. The Key word and vertical listing output
were then studied for occurrences of Null, Bonus, and Stigma words
and candidates for additional Cue Dictionary entries. Obvious errors
were noted and corrected, resulting in the Cue Dictionary, Version 2.

This first examination of the effect of the Dictionary led to
two decisions: (1) to formulate linguistic guidelines for further
classification oriented more specifically to the Exotic Fuel Corpus;
(2) to create a statistical data base (as done on the Heterogeneous
Corpus) from 20 documents of the Exotic Fuel Corpus, giving
selection ratio and frequency data.

* See Section 3.4. The operating system at this stage of the research
did not have the capability of handling negative dictionary weights.
Thus, the Stigma words were in the Cue Dictionary but not used.




Step 5. Linguistic Reorientation. The linguistic analysis of
the experimental results led to the following descriptions of word
classifications.

Null

    ordinals                          pronouns
    cardinals                         adjectives
    verb "to be"                      verbal auxiliaries
    prepositions                      articles
    verbs of state or process         coordinating conjunctions

Bonus

    comparatives                      relative interrogatives
    superlatives                      causality terms
    adverbs of conclusion             important conditions or
    value terms                         processes

Stigma

    belittling expressions            plurals of explicatory
    references elsewhere                expressions
    insignificant-detail              hedging expressions
      expressions

Residue

    positives
    technical terms
    archaic terms


Version 2 was now reviewed according to these guidelines
and Version 3 produced. The members of the indeterminate class
were given a final adjudication. The breakdown of this Dictionary
is 164 Null words, 789 Bonus words, 47 Stigma words. Version 3
of the Cue Dictionary was used in Experiments 4-11.*

* See Section 3.4. The reprogramming was completed for the use
of Stigma words in these experiments.

Step 6. Statistical Reorientation. The statistics for the
exotic fuel sample (Batch B) were examined. If a word had
frequency greater than or equal to 25, it was assigned to the Bonus
class when the selection ratio was high and to the Stigma class when
the selection ratio was low. The medium selection-ratio words with
exceptionally high frequency were assigned to the Null class. The
results of this mechanical step were reviewed for violations of
linguistic plausibility. Such violations were rare. That is to say,
the statistics confirmed to a large degree the previous dictionary
entries. Finally, the machine abstracts of Experiment 11 were
examined. It was observed that some words were improperly called
Key words, in particular, units of measurement. It was then
discovered that, when all word occurrences of units of measurement
were gathered into a single class, the selection ratio of this class
was very low. Thus units of measurement were assigned to the Stigma
class. Deletions were made to allow room for the new entries
according to the criterion of low probability of occurrence in the
Exotic Fuel Corpus. The final version (Version 4) was then formed to
incorporate these changes. The breakdown into final dictionaries is:
139 Null words, 783 Bonus words, 73 Stigma words.
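The treatment of units of measurement in Step 6 amounts to pooling the counts of several words before forming a selection ratio. The sketch below illustrates that pooling under the assumption that the class ratio is the total of selected occurrences divided by the total of all occurrences; the function name and the counts are hypothetical and are not data from the study.

```python
def pooled_selection_ratio(counts):
    """Selection ratio of a word class treated as one unit.

    counts: list of (occurrences, selected_occurrences) pairs, one per word.
    """
    counts = list(counts)
    total = sum(occurrences for occurrences, _ in counts)
    selected = sum(sel for _, sel in counts)
    if total == 0:
        raise ValueError("class has no occurrences")
    return selected / total


# Hypothetical counts for three unit words: (occurrences, selected occurrences)
units = {"cm": (12, 1), "psi": (30, 2), "sec": (18, 0)}
class_ratio = pooled_selection_ratio(units.values())  # 3 / 60
```

Pooling gives a class of individually infrequent words enough occurrences for the ratio to be meaningful, which is the reliability concern raised in Step 3.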

The Cue weights are summarized in Exhibit 22, and a section
of the Cue Dictionary is shown in Exhibit 4.

(2) Heading Dictionary and Ordinal Weights. The motivation
behind the use of location factors stems from two considerations.
The first is that we can postulate that if a "topic sentence" exists
in a paragraph it will tend to occur early or late in the paragraph.
The second is that, in a technical report, a sentence that occurs
under certain headings (e.g., "Purpose", "Conclusions") has a strong
chance of being suitable for extraction. Consequently a sentence
should receive a reward for its position in a paragraph and for its
occurrence under certain headings. The plan was to list certain
location characteristics of sentences and then to test whether such
characteristics were indeed relevant factors. Finally, weights were
assigned to the characteristics. The two types of location factors
mentioned above gave rise to the Heading Dictionary and the Ordinal
weights. The compilation of the Heading Dictionary will be discussed
first. The work proceeded by utilizing both linguistic and
statistical considerations (as in the Cue Dictionary compilation).




Step 1. Data Collection and Preliminary Classification.
Each heading occurring in 100 exotic fuel articles was listed
separately on a card together with the frequency of occurrence of
the heading. These headings were then classified by a scheme
developed in terms of the defined information content of abstracts.
The classification was: S, subject matter headings; P, purpose
headings; M, method headings; C, conclusion headings; G,
generalization headings; and R, recommendation headings. That is to
say, under an S-classed heading we expect to find general subject
matter sentences, under a P-classed heading we expect to find
sentences concerning the purpose of the article, etc. Another class,
F, was defined comprising headings of a purely functional nature
under which we expect to find summarizing sentences (e.g.,
"Abstract", "Summary"). The remaining cards were classified as: N,
no use in abstract (e.g., "Appendix"); I, indeterminate
(classification is possible, but the correct assignment is
doubtful); U, unclassified (no classification applies).

Step 2. Organization of Word Data. An alphabetic listing was
made of all words that occurred in the headings with the exception
of prepositions, articles, and highly specific words.* The words
were given with frequency of occurrence, frequency in combination,
total frequency, and distribution data of the classification of
Step 1; 156 words were thus obtained.

* Content words were included if they had the slightest generality
(e.g., "facilities", "materials"). That is to say, caution was
exercised in preserving the peculiarities of the corpus.

Step 3. Statistical Augmentation. Next, 20 documents of the
Exotic Fuel Corpus (Batch B) were selected for detailed study. An
outline of each document was prepared showing the number of
sentences occurring under each heading. The target abstracts of
these documents were then examined for the number of sentences
selected under each heading. It was found that 31 heading words of
the previous listing occurred in this document sample. For each of
these words the selection ratio was computed, i.e., the ratio of
the number of abstractor-selected sentences occurring under the
heading word to the total number of sentences occurring under it.
The frequency of occurrence of the heading words in the sample
was also listed as a guide to the reliability of the selection ratio.
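The heading-word selection ratio of Step 3 can be illustrated as follows. This is a sketch under simple assumptions: each sentence is represented by the words of the heading it falls under together with a flag saying whether the abstractor selected it. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def heading_selection_ratios(sentences):
    """Return {heading_word: (selection_ratio, sentence_count)}.

    sentences: iterable of (heading_words, was_selected) pairs, one per
    sentence, where heading_words are the words of the heading above it.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for heading_words, was_selected in sentences:
        for word in set(heading_words):
            totals[word] += 1
            if was_selected:
                selected[word] += 1
    return {w: (selected[w] / totals[w], totals[w]) for w in totals}


sample = [
    (["conclusions"], True),
    (["conclusions"], True),
    (["conclusions"], False),
    (["test", "apparatus"], False),
    (["test", "results"], True),
]
ratios = heading_selection_ratios(sample)
```

The sentence count is returned alongside each ratio because, as the text notes, a ratio based on few sentences is a less reliable guide.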

Step  4.  Assignment  of  Weights.  The  data  of  Steps  2  and  3 
were  then  examined  and  a  weight  was  assigned  to  each  heading 
word.  It  was  found  that  the  selection  ratios  confirmed  the 
linguistic  appraisal  of  word  importance.  Additions  and  deletions 
were  made:  the  former  by  listing  variants  of  important  words  or 
meaning  equivalents,  the  latter  by  the  criterion  of  low  frequency. 
The  final  Heading  Dictionary  comprised  90  words  (see  Exhibit  5). 

Ordinal Weights. Some 300 sentences were picked at
random from the Heterogeneous Corpus and selection ratios
computed. The data are as follows:

    Ordinal Characteristics                             Selection
                                                        Ratio

    First sentence of first paragraph                     .60
    Intermediate sentence of first paragraph              .35
    Last sentence of first paragraph                      .38
    First sentence of intermediate paragraph              .44
    Intermediate sentence of intermediate paragraph       .20
    Last sentence of intermediate paragraph               .17
    First sentence of last paragraph                      .38
    Intermediate sentence of last paragraph               .23
    Last sentence of last paragraph                       .24

These  data  served  to  establish  weights  for  the  ordinal  characteristics: 
first  paragraph,  last  paragraph,  first  sentence,  last  sentence.  We 
may  think  of  the  weights  of  intermediate  sentences  of  paragraphs  as 
being  zero.  The  adopted  weights  (see  Exhibit  22)  for  the  additive 
weight  system  reflect  the  gross  behavior  of  the  statistical  data. 
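One way to see the "gross behavior" of these data is to average the nine selection ratios by paragraph position and by sentence position. The sketch below does this with unweighted means, which is an assumption: the report does not give the number of sentences behind each ratio, so a properly weighted average cannot be formed here. The adopted weights themselves are in Exhibit 22 and are not reproduced.

```python
# (paragraph position, sentence position): selection ratio from the table
SELECTION_RATIO = {
    ("first", "first"): 0.60,
    ("first", "intermediate"): 0.35,
    ("first", "last"): 0.38,
    ("intermediate", "first"): 0.44,
    ("intermediate", "intermediate"): 0.20,
    ("intermediate", "last"): 0.17,
    ("last", "first"): 0.38,
    ("last", "intermediate"): 0.23,
    ("last", "last"): 0.24,
}


def marginal_mean(ratios, axis, position):
    """Unweighted mean ratio for one position.

    axis 0 groups by paragraph position, axis 1 by sentence position.
    """
    values = [r for key, r in ratios.items() if key[axis] == position]
    return sum(values) / len(values)


first_paragraph = marginal_mean(SELECTION_RATIO, 0, "first")
first_sentence = marginal_mean(SELECTION_RATIO, 1, "first")
```

On these unweighted figures, sentences in the first paragraph and first sentences of paragraphs stand out most clearly from the intermediate positions.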




(3) Title Weights. The present research involved the creation
of two new techniques relying upon two sentence characteristics not
previously investigated. The first of these is founded upon words of
the title and subtitles (i.e., headings) and hence is called the
Title method. It is based upon purely linguistic considerations
stemming from Principle 1 mentioned in Section 3.3.2. Here we rely
upon the fact that a skilled author conceives of the title as
circumscribing the subject matter of the document. Thus, whether or
not the author incorporates his own abstract at the beginning of the
document, he in effect conceives a title that can be viewed as an
abstract of the abstract of the document. Similarly, when he
partitions the body of the document into major sections he
summarizes by choosing the proper words to form his subtitles. Thus,
it is believed that the content words of title and subtitles contain
an important source of clues for an automatic abstracting system. In
an examination of this hypothesis, title and subtitle words were
verified as being statistically relevant characteristics. The
hypothesis that such content words are irrelevant can be rejected at
the 1% level of significance.

The Title method then creates, by a computer program, a
Title Glossary consisting of the content words of the title and
subtitles of the document. Words in the main body are then matched
against the Title words, and each match awards the sentence
containing it a positive weight. The weights assigned to the words
of the Title Glossary are based on the consideration of their effect
in the combined weighting scheme of the four methods. It was decided
that content words of the title should outweigh content words of the
headings. Therefore, the former were initially given the weight of
20 and the latter a weight of 10. This assignment of weights,
however, led to a difficulty in the ranking of all sentences of the
document when the Title method was used. This was due to 20 being an
exact multiple of 10, which caused many ties among sentence weights.




To  avoid  the  occurrence  of  numerous  ties,  the  following 
device  was  employed.  Title  words  were  assigned  the  weight  of  11, 
and  heading  words  the  weight  of  7  (see  Exhibit  22).  This  arithmetic 
trick  is  based  upon  the  fact  that  11  and  7  are  relatively  prime  and 
thus  the  probability  of  a  tie  is  extremely  small. 
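The effect of the weight change can be checked by counting how many distinct Title scores are possible. The sketch below assumes, for illustration only, that a sentence's Title score is the title weight times its number of title-word matches plus the heading weight times its number of heading-word matches, with each count ranging from 0 to 6; the function name and the range are hypothetical.

```python
def distinct_scores(title_weight, heading_weight, max_matches):
    """Count distinct scores over all (title, heading) match counts."""
    scores = {
        title_weight * t + heading_weight * h
        for t in range(max_matches + 1)
        for h in range(max_matches + 1)
    }
    return len(scores)


old = distinct_scores(20, 10, 6)  # 49 combinations collapse to 19 scores
new = distinct_scores(11, 7, 6)   # all 49 combinations stay distinct
```

With weights 11 and 7, two different pairs of match counts can tie only when the title counts differ by a multiple of 7 and the heading counts by the corresponding multiple of 11 (for example, 7 title matches against 11 heading matches), which is rare for sentences of ordinary length.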

(4) Key Word Weights. The principle of the Key method was
the first one proposed for the creation of automatic abstracts.
The previous study used the following definition of Key words:
Candidates for Key words were first selected as all non-Cue words
that occurred in the top 25 per cent Cue-weighted sentences. The
Key word candidates were then frequency-counted over this
collection of sentences and ranked in order of frequency. Key words
were defined as those candidates which totaled the first 100
occurrences, and the Key weights were taken to be the frequency of
occurrence over that collection. The present study led us to
change both of these conditions so that in the present system Key
words are chosen from among the top 10 per cent of the total
number of words in the document, and secondly their frequency of
occurrence is computed over all words of the entire text. The Key
weight of a word is taken to be its frequency of occurrence in the
document (see Exhibit 22).

It is felt that the change from a constant threshold of 100
occurrences to a fractional threshold of 10 per cent is an
improvement. Moreover, both statistical and linguistic
investigation have supported the shift from the more narrow
environment of high Cue-weighted sentences to the wider environment
of all text words.
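A minimal sketch of the revised Key word rule follows. It reads the 10 per cent threshold in the way it is worded later in Section 3.4.2, namely as the shortest frequency-ranked list of non-Cue words whose occurrences make up 10 per cent of all word occurrences in the document; that reading, the function name, and the toy word list are assumptions made for illustration.

```python
from collections import Counter


def key_glossary(words, cue_words, fraction=0.10):
    """Return {key_word: key_weight} for one document.

    words: all word occurrences of the document, in order.
    cue_words: words already in the Cue Dictionary (excluded from Key words).
    The Key weight of a word is its frequency in the whole document.
    """
    total = len(words)
    counts = Counter(w for w in words if w not in cue_words)
    glossary = {}
    covered = 0
    for word, freq in counts.most_common():
        if covered >= fraction * total:
            break  # the list already accounts for the required share
        glossary[word] = freq
        covered += freq
    return glossary


doc = ["the"] * 50 + ["of"] * 36 + ["fuel"] * 6 + ["boron"] * 5 + ["hydride"] * 3
keys = key_glossary(doc, cue_words={"the", "of"})
```

Because the threshold is a fraction of the document length rather than a fixed 100 occurrences, the size of the Key Glossary grows and shrinks with the document.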

3.4  EXPERIMENTS

In this section we describe 17 experiments conducted in the
course of the research. These experiments can be classified into
4 groups which represent project milestones. A chart of the
experiments is given in Exhibit 23. The final production runs are
also represented as well as the initial system checkout. Thus the
chart gives a history of the experiments, introduction of the various
methods, and modification of the weighting rules.

A résumé of the experiments now follows.


3.4.1  Preliminary Test


To facilitate the research planned in this study, the computer
programs used in the previous work were converted to the Space
Technology Laboratories computer system (see Ref. 9) and the
research output (e.g., the vertical listing) was incorporated in the
program. Upon completion of these tasks an initial system test was
conducted. An exotic fuel document was taken through the system
from copy, pre-edited version, keypunched deck, edit (transfer to
magnetic tape), to machine abstract with satisfactory results.
Format changes were made in the vertical listing printout (see
Exhibit 16).


3.4.2  Experimental Cycles

Group I. The purposes of these experiments were:

    (1) to study the effect of the Cue Dictionary, Version 1,
        on exotic fuel documents.

    (2) to introduce the Title method both in isolation and in
        combination with the Cue and Key methods.

    (3) to generate a sample of automatic abstracts of
        exotic fuel documents for use in a pilot evaluation
        study (see Section 3.5).

The results of the Cue Dictionary study are reported in
Section 3.3.4. A review of the Title method output confirmed the
conjecture that title and heading words play a significant role in
the abstracting. A review of the Key word lists led to a
modification of the Key word definition.

Group II. After the third Cue Dictionary compilation (see
Section 3.3.4) and modification of the Key word definition, it was
decided to do a more complete series of experiments. The purposes
were:

    (1) to study the effect of the Cue Dictionary, Version 3.

    (2) to study all possible combinations of the three
        methods, Cue (C), Key (K), Title (T):

        a. All three in combination (Experiment 4)

        b. In pairs (Experiments 5-7)

        c. In isolation (Experiments 8-10)

    (3) to study the effect of the modified Key word
        definition.


A comparison of the machine abstracts with the target
abstracts in Experiment 4 (C-K-T) revealed that the machine
technique was selecting 37 per cent of the target abstract
sentences. A study of the printouts led to the following decisions:

    (1) The format of the abstracts should be changed to
        include all headings together with the selected
        sentences. Because of the heading structure of the
        technical reports under consideration, it was noted
        that: (a) if a machine abstract had a comparable
        structure it could more easily serve the screening
        function of abstracts, (b) this structure would help
        clarify the meaning of a sentence lifted out of
        context.

    (2) The Title weights should be changed to give better
        discrimination between sentences. The arithmetic
        trick described in Section 3.3.4 was used.

    (3) The Location method should be tested in combination
        with the Cue, Key, and Title methods before
        final resolution of the Key word definition (which was
        still unsatisfactory).

Group III. This consisted of a single experiment (11) for the
purpose of studying the new format, using a corrected Version 3 of
the Cue Dictionary,* and testing the new Location method in
combination. A comparison with target abstracts showed that 44 per
cent of the target abstract sentences were now being selected by the
machine.* It was apparent that the Location factors made a significant
improvement in the technique.

* It was found that 51 Cue words had not been transferred to the
Dictionary tape.
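The overlap figures quoted for Experiments 4 and 11 (37 and 44 per cent) are percentages of target abstract sentences that the machine also selected. A minimal sketch of that computation, assuming sentences are identified by hypothetical integer ids and that headings have already been excluded, is:

```python
def overlap_percent(target_ids, machine_ids):
    """Per cent of target abstract sentences also present in the machine abstract."""
    target = set(target_ids)
    if not target:
        raise ValueError("target abstract is empty")
    return 100.0 * len(target & set(machine_ids)) / len(target)


# Hypothetical sentence ids for one document
example = overlap_percent([1, 4, 7, 9], [1, 2, 7, 8, 10])
```

Note that the measure is taken relative to the target abstract only, so extra machine-selected sentences do not lower it.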

A review of the data of Groups I, II, and III suggested that another
group of experiments would be adequate to choose the final abstracting
techniques in accordance with the scope of the present effort. It
was decided that

(1)  The  abstract  format  was  satisfactory. 

(2)  The  Cue  Dictionary,  Version  4  (obtained  by  the 
statistical  revision  of  Version  3)  was  adequate. 

(3)  A  satisfactory  Key  word  list  would  be  obtained  by 
taking  the  shortest  list  of  Key  words  that  comprises 
10  per  cent  of  the  total  number  of  word  occurrences 
in  the  document. 

(4)  The  Title  weights  were  satisfactory. 

(5)  The  Heading  Dictionary  was  satisfactory. 

Group  IV.  These  experiments  were  designed  to  test  the  above 
decisions.  The  results  are  given  in  two  studies. 

First,  the  percent  of  sentences  in  common  with  the  target 
abstracts  was  computed.  The  mean  percentages  are  shown  below 
together  with  the  intervals  encompassing  the  mean  plus  or  minus  one 
standard  deviation.  The  data  for  a  random  selection  of  25  per  cent  of 
the  sentences  is  given  for  comparison.  It  is  to  be  noted  that  the  C-T-L 
method  has  the  highest  mean  value  while  the  Key  method  in  isolation  has 
the  lowest. 


* Headings are not counted in the computation of "overlap" data.


[Chart: mean per cent of sentences in common with the target
abstracts, with intervals of plus or minus one standard deviation,
for the methods C-T-L, C-K-T-L, Location, Cue, Title, Key, and
Random; horizontal scale 0 to 100.]

On the basis of these data it was decided to omit the Key word
component in the final abstracting system. The data confirm a
hypothesis set forth previously* that Key words, while important for
indexing, may not be so important for abstracting. This decision has
important consequences for an abstracting system: considerable
simplification can be achieved in the computer program if
frequency-counting the entire text can be avoided.

The  second  study  involved  a  detailed  comparison  of  the  abstracts 
produced  by  the  C-T-L  method  with  the  target  abstracts.  We  can 
uniquely  classify  every  sentence  in  a  document  by  considering  all 
combinations  of  the  properties:  worthy  to  be  in  an  abstract,  in  the 
target  abstract,  in  the  machine  abstract,  and  the  negations  of  these 
properties.  Eight  classes  result,  of  which  two  define  the  most 
significant  machine  errors,  namely: 

Type  1  Error:  Sentence  is  worthy  and  in  the  target 
abstract,  but  is  not  in  machine  abstract. 

Type  2  Error:  Sentence  is  not  worthy  and  is  not  in 
the  target  abstract  but  is  in  the  machine  abstract. 

Exhibit  24  gives  the  tabular  breakdown  of  the  analysis  of  the  C-T-L 
abstracts  of  Experiment  16. 
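The eight classes arise from three yes-or-no properties of each sentence. The sketch below enumerates them and picks out the two error types as defined above; the function name and the label "other" for the remaining six classes are hypothetical conveniences.

```python
from itertools import product


def sentence_class(worthy, in_target, in_machine):
    """Label a sentence by the two error types defined in the text."""
    if worthy and in_target and not in_machine:
        return "type 1 error"   # worthy target sentence missed by the machine
    if not worthy and not in_target and in_machine:
        return "type 2 error"   # unworthy sentence selected by the machine
    return "other"


# All combinations of (worthy, in_target, in_machine)
ALL_CLASSES = list(product([True, False], repeat=3))
labels = [sentence_class(*combo) for combo in ALL_CLASSES]
```

Exactly one of the eight combinations corresponds to each error type; the other six include, for example, worthy sentences selected by both the abstractor and the machine.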


* Ref. 7


Exhibit 24. Analysis of C-T-L Selected Sentences

[Table: the per-document entries are illegible in this copy.]

Total:  116  311  177  32  52  261


3.5  EVALUATION


A sample of 40 documents of the Test Library (documents not
used in the abstracting experiments) was selected for evaluation. The
purpose was to arrive at a gross evaluation of the quality of the
machine product. To this end an evaluation procedure was designed
based on (1) the definition of target abstract, and (2) the rating of the
degree of similarity between two abstracts. Random extracts were
used as controls. The Instructions to Raters are shown in Exhibit 25
and a Rating Form is shown in Exhibit 26.

3.5.1  Rating Procedure

(1)  Uniform  Presentation  of  Materials.  The  machine  abstract, 
target  abstract,  and  randomly  generated  extract  were  typed  in  the 
same  format.  Typing  was  necessary  because  machine  printouts  of  the 
target  abstract  and  random  extract  were  not  available.  Headings  were 
included  as  well  as  paragraph  and  sentence  designations.  The  target 
abstract  was  identified,  but  the  machine  abstract  and  random  extract 
were  code  designated. 

(2) Similarity Rating. The raters judged the similarity between
the target abstract and each of the other two according to the
Instructions for Raters.

(3) Scoring. The ratings were scored by giving 4 points for
complete similarity, 3 points for considerable similarity, . . . ,
and 0 points for no similarity.

A maximum possible score was next computed by considering
the number of not-applicable information types (none gives a maximum
of 24, 1 gives a maximum of 20, etc.). By dividing the total score by
the maximum possible, a normalized score was then obtained. This
was interpreted as the degree of similarity between the abstracts and
was recorded as a percentage.
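The normalization can be illustrated with a short sketch. It assumes that a rating sheet is a list of six entries, one per information type, each either an integer from 0 to 4 or None for "not applicable"; the function name and the example ratings are hypothetical.

```python
def normalized_similarity(ratings):
    """Return the similarity score as a percentage of the maximum possible.

    ratings: one entry per information type, 0..4 points, or None when the
    information type is not applicable.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("no applicable information types")
    maximum = 4 * len(applicable)  # 24 with none excluded, 20 with one, etc.
    return 100.0 * sum(applicable) / maximum


# Five applicable types scoring 4, 3, 2, 3, 4 out of a maximum of 20
example_score = normalized_similarity([4, 3, 2, 3, None, 4])
```

Excluding not-applicable types from the maximum keeps documents that lack, say, recommendations from being penalized for a category neither abstract could contain.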




(4)  Participants.  Two  raters  were  used.  The  first  rater  was 
familiar  with  the  40  documents.  The  second  was  not  associated  with 
the  project  except  for  the  evaluation  test.  The  resulting  ratings, 
reported  below,  justified  the  decision  that  a  third  rater  was  not 
required.  The  high  consistency  of  the  raters  is  shown  by  the  small 
variation  of  the  mean  scores  in  three  samples  of  10  documents  each. 


                        Target vs. Machine      Target vs. Random

    Samples             Rater 1    Rater 2      Rater 1    Rater 2

    Sample 1 (n = 10)     72%        85%          39%        38%

    Sample 2 (n = 10)     53%        56%          37%        30%

    Sample 3 (n = 10)     66%        61%          32%        29%

The scores of the two raters were averaged for a single similarity
score. Thus for each document two calculations were made: the
percent similarity between Target and Machine, and the percent
similarity between Target and Random.

(5)  Sampling  and  Assembling  of  Data.  The  documents  were 
divided  into  4  samples  of  10  each.  The  purpose  was  to  obtain  a 
reliable  mean  similarity  rating  by  working  through  these  samples. 
It  was  found  that  3  samples  were  sufficient.  For  each  sample  the 
mean  similarity  rating  was  computed  as  well  as  the  standard 
deviation  of  the  mean.  The  following  table  shows  the  standard 
deviations  of  the  similarity  ratings  for  cumulative  samples. 




INSTRUCTIONS  FOR  RATERS 


You  are  asked  to  read  a  pair  of  abstracts  of  a  document  and 
then  to  judge  the  degree  of  similarity  of  the  information  contained  in 
the  two  abstracts.  It  is  not  necessary  that  you  understand  the  subject 
under  discussion:  only  that  you  recognize  the  kind  of  information  given. 

Please  compare  the  two  abstracts  with  regard  to  the  following 
six  information  types: 

Subject  Matter.  Information  indicating  the  general  subject  area 
that  is  the  author's  principal  concern;  i.  e.  ,  what? 

Purpose.  Information  indicating  whether  the  author's  principal 
intent  is  to  offer  original  research  findings,  to  survey  or  to 
evaluate  the  work  performed  by  others,  to  present  a  speculative 
or  theoretical  discussion,  or  to  serve  some  other  main  purpose; 
i.  e.  ,  why? 

Methods. Information indicating the methods used in conducting
the research. Depending on the type of research, such statements
may refer to experimental procedures, mathematical
techniques, or other methods of scientific investigation; i.e.,
how?

Conclusions  or  Findings.  Information  indicating  the  results 
obtained  in  the  research  or  the  findings  of  the  author. 

Generalizations  or  Implications.  Information  indicating  the 
significance  of  the  research  and  its  bearing  on  broader 
technical  problems  or  theory. 

Recommendations or Suggestions. Information indicating
recommended courses of action or suggested areas of future work.

Using the following rating scale, place a check in the box on the
rating form (see Exhibit 26) corresponding to one of the degrees listed
below that best indicates the degree of similarity between the two
abstracts:


Not  applicable 
No  similarity 
Moderate  similarity 
Considerable  similarity 
Complete  similarity 


Exhibit  25.  Instructions  for  Raters 


Exhibit 26. Rating Form


                            Target vs. Machine    Target vs. Random

    First Determination:
    Sample 1                    m = 79%               m = 38%
    (n = 10)                    s_m = 4%              s_m = 4%

    Second Determination:
    Samples 1, 2                m = 67%               m = 36%
    (n = 20)                    s_m = 5%              s_m = 4%

    Third Determination:
    Samples 1, 2, 3             m = 66%               m = 34%
    (n = 30)                    s_m = 3%              s_m = 3%
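As a rough cross-check, the cumulative means can be recomputed from the per-sample rater means given under item (4). The sketch below assumes that each sample contributes equally (all have n = 10) and that the two raters are averaged with equal weight; because the rater means are themselves rounded, agreement with the reported figures can only be approximate. The names are hypothetical.

```python
# Per-sample mean scores (Rater 1, Rater 2), in per cent, from item (4)
RATER_MEANS = {
    "machine": [(72, 85), (53, 56), (66, 61)],
    "random": [(39, 38), (37, 30), (32, 29)],
}


def cumulative_means(pairs):
    """Mean similarity over samples 1, then 1-2, then 1-3."""
    means = []
    running_total = 0.0
    for count, (rater1, rater2) in enumerate(pairs, start=1):
        running_total += (rater1 + rater2) / 2
        means.append(running_total / count)
    return means


machine = cumulative_means(RATER_MEANS["machine"])  # 78.5, 66.5, 65.5
random_ = cumulative_means(RATER_MEANS["random"])
```

The recomputed Target vs. Machine values of 78.5, 66.5, and 65.5 agree with the reported 79, 67, and 66 per cent once rounded up, and the second and third Target vs. Random values (36.0 and about 34.2) agree with the reported 36 and 34 per cent.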

Overlap data (i.e., percent agreement between Target and Machine)
for cumulative samples 1, 2, and 3 give 44 percent agreement
between machine and target. Thus we have the following
correspondence between similarity ratings and overlap data:

    Similarity        % Overlap

       100            100 (defined)
        66             44 (computed)
        34             25 (computed)


(6)  Conclusions,  (a)  The  mean  similarity  rating  between 
the  target  and  machine  abstracts  is  66  per  cent;  the  standard  deviation 
of  the  mean  is  3  per  cent,  (b)  The  mean  similarity  rating  between 
the  target  and  the  randomly  generated  abstracts  is  34  per  cent;  the 
standard  deviation  of  the  mean  is  3  per  cent. 


3.5.2  Comments on Evaluation Procedures

Because no attempt was made to evaluate the utility of the
target abstracts, the above findings (that the machine abstract
mirrors the target abstract at a 66% level) do not evaluate the
utility of the machine abstracts. No conclusions about the utility
of the machine abstracts can be drawn from the relative evaluation
criteria utilized in this effort. However, it is interesting to analyze
the reasons for the apparent useful qualities of the machine abstracts.

A  sentence-by-sentence  analysis  is  summarized  in  Exhibit  24.  Such 
an  analysis  is  valuable  as  a  supplement  to  the  similarity  rating  because 
a  high  similarity  may  exist  in  specific  sentences  without  coherence  and 
a  low  similarity  rating  may  be  based  on  a  poor  representative  of  an 
information  type. 

The  number  of  agreements  with  target  sentences  shows  only  part 
of  the  actual  correspondence  between  the  machine  and  human  abstracts. 
When  the  number  is  expanded  to  include  the  machine -selected  sentences 
cointensional  with  target  sentences,  the  total  represents  the  degree  to 
which  the  machine  process  included  the  information  of  the  target 
abstract.  The  category  of  "Additional  Worthy  Sentences"  includes 
those  which  might  be  included  in  an  abstract  of  unrestricted  length. 

This  group  and  the  preceding  two  comprise  the  "Total  of  Abstract- 
Worthy  Sentences",  which  may  be  taken  to  represent  that  part  of  the 
machine  abstract  which  conforms  to  the  content  aspect  of  the  working 
definition  of  "abstract".  The  average  of  84  percent,  even  though 
uncorrected for redundancy in the machine abstract, seems to represent
a highly promising achievement in the automation of the
abstracting  process. 

The sentences selected by the machine process but not abstract-worthy
(Type 2 errors) are extraneous detail and represent "noise". They
clutter the abstract and often interfere seriously with coherence.
Minimizing this group should be one goal of future research. The
sentences resulting in Type 1 errors represent information included
in the target abstract but not in the machine abstract. Their
significance can only be discovered by looking at the sentence in
question.
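
The sentence categories discussed above can be tallied as in the following minimal sketch. The class name, field names, and counts are hypothetical, and the choice of the total number of machine-selected sentences as the denominator is an assumption; the figures in Exhibit 24 may have been computed differently.

```python
from dataclasses import dataclass


@dataclass
class SentenceTally:
    """Counts for one machine abstract compared with its target abstract."""
    agreements: int         # machine sentences that agree with target sentences
    cointensional: int      # machine sentences cointensional with target sentences
    additional_worthy: int  # other sentences fit for an abstract of unrestricted length
    type2_errors: int       # machine-selected sentences that are not abstract-worthy
    type1_errors: int       # target sentences with no counterpart in the machine abstract

    @property
    def abstract_worthy(self) -> int:
        """Total of abstract-worthy sentences: the first three groups."""
        return self.agreements + self.cointensional + self.additional_worthy

    @property
    def machine_total(self) -> int:
        """All sentences selected by the machine process."""
        return self.abstract_worthy + self.type2_errors

    def percent_abstract_worthy(self) -> float:
        if self.machine_total == 0:
            raise ValueError("machine abstract contains no sentences")
        return 100.0 * self.abstract_worthy / self.machine_total

    def percent_type2(self) -> float:
        if self.machine_total == 0:
            raise ValueError("machine abstract contains no sentences")
        return 100.0 * self.type2_errors / self.machine_total


# Hypothetical counts for a single document, for illustration only.
example = SentenceTally(agreements=6, cointensional=4, additional_worthy=6,
                        type2_errors=4, type1_errors=3)
print(f"abstract-worthy: {example.percent_abstract_worthy():.0f} per cent")
print(f"Type 2 errors:   {example.percent_type2():.0f} per cent")
```

Type 1 errors are counted against the target abstract and not the machine abstract, so they are kept as a separate count and do not enter either percentage.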

Future research should involve statistical analysis of the two error
types with the purpose of modifying the program to minimize them. Study
should be made to discover machine-recognizable clues to determine the
proper length of an abstract. The extent to which redundancy
appears  in  the  machine  abstract  and  ways  of  mechanizing  its 
suppression  should  be  investigated.  Linguistic  clues  to  coherence 
should  be  investigated  and  expressed  in  machine-recognizable  form, 
perhaps  in  the  form  of  a  word-and-phrase  dictionary  indicating  the 
need  for  selecting  an  antecedent  sentence. 
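
One possible form of such a word-and-phrase dictionary is sketched below. The cue list and the function names are hypothetical and serve only to illustrate the idea; no such dictionary was compiled in this study. The sketch assumes that a selected sentence opening with a cue word or phrase depends on the sentence immediately preceding it.

```python
import re

# Hypothetical cue words and phrases that suggest dependence on a prior sentence.
ANTECEDENT_CUES = ("this", "these", "those", "such", "however",
                   "therefore", "thus", "in addition")


def needs_antecedent(sentence, cues=ANTECEDENT_CUES):
    """Return True if the sentence opens with one of the cue words or phrases."""
    words = re.findall(r"[a-z]+", sentence.lower())
    for cue in cues:
        cue_words = cue.split()
        if words[:len(cue_words)] == cue_words:
            return True
    return False


def add_antecedents(sentences, selected):
    """Return the selected indices plus any antecedent sentences they require."""
    result = set(selected)
    for i in selected:
        j = i
        # Walk backward while each sentence in the chain opens with a cue.
        while j > 0 and needs_antecedent(sentences[j]):
            j -= 1
            result.add(j)
    return sorted(result)


# Hypothetical document sentences and machine selection, for illustration only.
sentences = [
    "A new method is described.",
    "This method uses word frequency.",
    "Results are given.",
    "However, errors remain.",
    "The cost is low.",
]
selected = [1, 4]
print(add_antecedents(sentences, selected))
```

Here sentence 1 opens with "This", so sentence 0 is added as its antecedent, while sentence 4 opens with no cue and is left alone. A cue in the opening position is only a rough signal; a practical dictionary would have to be derived from study of actual abstracts.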

In  short,  the  main  differences  between  human  and  machine 
abstracts (see Section 3.3.1) can now be described and tabulated
and  procedures  outlined  to  minimize  them.  However,  in  the  last 
analysis,  progress  in  the  creation  of  automatic  abstracts  must  be 
verified  by  the  users. 


REFERENCES 


1. Proposal for a Study of Automatic Abstracting, Attachment III -
Technical Discussion, P705-9U3, 2 December 1959, Ramo-Wooldridge,
a Division of Thompson Ramo Wooldridge Inc., Canoga Park,
California.

2. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 17 June 1960.

3. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 17 September 1960.

4. Proposal for Engineering Change "A" to Contract AF 30(602)-2223
(Study for Automatic Abstracting), Attachment II - Technical
Discussion, P1214-0U2, 2 December 1960, Ramo-Wooldridge,
a Division of Thompson Ramo Wooldridge Inc., Canoga Park,
California.

5. Study for Automatic Abstracting, Quarterly Memorandum and
Interim Technical Report, AF 30(602)-2223, C107-1U6, 17 March
1960, Ramo-Wooldridge, a Division of Thompson Ramo Wooldridge
Inc., Canoga Park, California.

6. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 17 June 1961.

7. Study for Automatic Abstracting, Final Report, AF 30(602)-2223,
C107-1U12, 1 September 1961, Ramo-Wooldridge, a Division of
Thompson Ramo Wooldridge Inc., Canoga Park, California.

8. Proposal for Engineering Change "B" to Contract AF 30(602)-2223
(Study for Automatic Abstracting), Attachment II - Technical
Discussion, P1397-1U2, 1 October 1961, Ramo-Wooldridge, a Division
of Thompson Ramo Wooldridge Inc., Canoga Park, California.

9. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 30 April 1962.

10. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 30 July 1962.

11. Study for Automatic Abstracting, Quarterly Memorandum,
AF 30(602)-2223, 30 October 1962.

12. H. P. Edmundson and R. E. Wyllys, "Automatic Abstracting and
Indexing--Survey and Recommendations," Communications of the
Association for Computing Machinery, vol. 4, no. 5, pp. 226-234.


13. H. P. Edmundson, "A Statistician's View of Linguistic Models
and Language Data Processing," Natural Language and the
Computer, P. Garvin (editor), to be published by McGraw-Hill.

14. H. P. Edmundson, "Problems in Automatic Abstracting,"
Joint Man-Computer Indexing and Abstracting, 20 November 1962,
Session 13, First Congress on the Information System Sciences.

15. J. L. Kuhns, "An Application of Logical Probability to Problems
in Automatic Abstracting and Information Retrieval," Joint Man-
Computer Indexing and Abstracting, 20 November 1962, Session 13,
First Congress on the Information System Sciences.


